Git Symlinks in Windows

Ken Hirakawa picture Ken Hirakawa · May 6, 2011 · Viewed 111k times · Source

Our developers use a mix of Windows and Unix based OS's. Therefore, symlinks created on Unix machines become a problem for Windows developers. In windows (msysgit), the symlink is converted to a text file with a path to the file it points to. Instead, I'd like to convert the symlink into an actual Windows symlink.

The (updated) solution I have to this is:

  • Write a post-checkout script that will recursively look for "symlink" text files.
  • Replace them with windows symlink (using mklink) with same name and extension as dummy "symlink"
  • Ignore these windows symlink by adding entry into .git/info/exclude

I have not implemented this, but I believe this is a solid approach to this problem.

Questions:

  1. What, if any, downsides do you see to this approach?
  2. Is this post-checkout script even implementable? i.e. can I recursively find out the dummy "symlink" files git creates?
  3. Has anybody already worked on such script?

Answer

Mark G. picture Mark G. · May 25, 2013

I was asking this exact same question a while back (not here, just in general) and ended up coming up with a very similar solution to OP's proposition. First I'll provide direct answers to questions 1 2 & 3, and then I'll post the solution I ended up using.

  1. There are indeed a few downsides to the proposed solution, mainly regarding an increased potential for repository pollution, or accidentally adding duplicate files while they're in their "Windows symlink" states. (More on this under "limitations" below.)
  2. Yes, a post-checkout script is implementable! Maybe not as a literal post-git checkout step, but the solution below has met my needs well enough that a literal post-checkout script wasn't necessary.
  3. Yes!

The Solution:

Our developers are in much the same situation as OP's: a mixture of Windows and Unix-like hosts, repositories and submodules with many git symlinks, and no native support (yet) in the release version of MsysGit for intelligently handling these symlinks on Windows hosts.

Thanks to Josh Lee for pointing out the fact that git commits symlinks with special filemode 120000. With this information it's possible to add a few git aliases that allow for the creation and manipulation of git symlinks on Windows hosts.

  1. Creating git symlinks on Windows

    git config --global alias.add-symlink '!'"$(cat <<'ETX'
    __git_add_symlink() {
      if [ $# -ne 2 ] || [ "$1" = "-h" ]; then
        printf '%b\n' \
            'usage: git add-symlink <source_file_or_dir> <target_symlink>\n' \
            'Create a symlink in a git repository on a Windows host.\n' \
            'Note: source MUST be a path relative to the location of target'
        [ "$1" = "-h" ] && return 0 || return 2
      fi
    
      source_file_or_dir=${1#./}
      source_file_or_dir=${source_file_or_dir%/}
    
      target_symlink=${2#./}
      target_symlink=${target_symlink%/}
      target_symlink="${GIT_PREFIX}${target_symlink}"
      target_symlink=${target_symlink%/.}
      : "${target_symlink:=.}"
    
      if [ -d "$target_symlink" ]; then
        target_symlink="${target_symlink%/}/${source_file_or_dir##*/}"
      fi
    
      case "$target_symlink" in
        (*/*) target_dir=${target_symlink%/*} ;;
        (*) target_dir=$GIT_PREFIX ;;
      esac
    
      target_dir=$(cd "$target_dir" && pwd)
    
      if [ ! -e "${target_dir}/${source_file_or_dir}" ]; then
        printf 'error: git-add-symlink: %s: No such file or directory\n' \
            "${target_dir}/${source_file_or_dir}" >&2
        printf '(Source MUST be a path relative to the location of target!)\n' >&2
        return 2
      fi
    
      git update-index --add --cacheinfo 120000 \
          "$(printf '%s' "$source_file_or_dir" | git hash-object -w --stdin)" \
          "${target_symlink}" \
        && git checkout -- "$target_symlink" \
        && printf '%s -> %s\n' "${target_symlink#$GIT_PREFIX}" "$source_file_or_dir" \
        || return $?
    }
    __git_add_symlink
    ETX
    )"
    

    Usage: git add-symlink <source_file_or_dir> <target_symlink>, where the argument corresponding to the source file or directory must take the form of a path relative to the target symlink. You can use this alias the same way you would normally use ln.

    E.g., the repository tree:

    dir/
    dir/foo/
    dir/foo/bar/
    dir/foo/bar/baz      (file containing "I am baz")
    dir/foo/bar/lnk_file (symlink to ../../../file)
    file                 (file containing "I am file")
    lnk_bar              (symlink to dir/foo/bar/)
    

    Can be created on Windows as follows:

    git init
    mkdir -p dir/foo/bar/
    echo "I am baz" > dir/foo/bar/baz
    echo "I am file" > file
    git add -A
    git commit -m "Add files"
    git add-symlink ../../../file dir/foo/bar/lnk_file
    git add-symlink dir/foo/bar/ lnk_bar
    git commit -m "Add symlinks"
    
  2. Replacing git symlinks with NTFS hardlinks+junctions

    git config --global alias.rm-symlinks '!'"$(cat <<'ETX'
    __git_rm_symlinks() {
      case "$1" in (-h)
        printf 'usage: git rm-symlinks [symlink] [symlink] [...]\n'
        return 0
      esac
      ppid=$$
      case $# in
        (0) git ls-files -s | grep -E '^120000' | cut -f2 ;;
        (*) printf '%s\n' "$@" ;;
      esac | while IFS= read -r symlink; do
        case "$symlink" in
          (*/*) symdir=${symlink%/*} ;;
          (*) symdir=. ;;
        esac
    
        git checkout -- "$symlink"
        src="${symdir}/$(cat "$symlink")"
    
        posix_to_dos_sed='s_^/\([A-Za-z]\)_\1:_;s_/_\\\\_g'
        doslnk=$(printf '%s\n' "$symlink" | sed "$posix_to_dos_sed")
        dossrc=$(printf '%s\n' "$src" | sed "$posix_to_dos_sed")
    
        if [ -f "$src" ]; then
          rm -f "$symlink"
          cmd //C mklink //H "$doslnk" "$dossrc"
        elif [ -d "$src" ]; then
          rm -f "$symlink"
          cmd //C mklink //J "$doslnk" "$dossrc"
        else
          printf 'error: git-rm-symlink: Not a valid source\n' >&2
          printf '%s =/=> %s  (%s =/=> %s)...\n' \
              "$symlink" "$src" "$doslnk" "$dossrc" >&2
          false
        fi || printf 'ESC[%d]: %d\n' "$ppid" "$?"
    
        git update-index --assume-unchanged "$symlink"
      done | awk '
        BEGIN { status_code = 0 }
        /^ESC\['"$ppid"'\]: / { status_code = $2 ; next }
        { print }
        END { exit status_code }
      '
    }
    __git_rm_symlinks
    ETX
    )"
    
    git config --global alias.rm-symlink '!git rm-symlinks'  # for back-compat.
    

    Usage:

    git rm-symlinks [symlink] [symlink] [...]
    

    This alias can remove git symlinks one-by-one or all-at-once in one fell swoop. Symlinks will be replaced with NTFS hardlinks (in the case of files) or NTFS junctions (in the case of directories). The benefit of using hardlinks+junctions over "true" NTFS symlinks is that elevated UAC permissions are not required in order for them to be created.

    To remove symlinks from submodules, just use git's built-in support for iterating over them:

    git submodule foreach --recursive git rm-symlinks
    

    But, for every drastic action like this, a reversal is nice to have...

  3. Restoring git symlinks on Windows

    git config --global alias.checkout-symlinks '!'"$(cat <<'ETX'
    __git_checkout_symlinks() {
      case "$1" in (-h)
        printf 'usage: git checkout-symlinks [symlink] [symlink] [...]\n'
        return 0
      esac
      case $# in
        (0) git ls-files -s | grep -E '^120000' | cut -f2 ;;
        (*) printf '%s\n' "$@" ;;
      esac | while IFS= read -r symlink; do
        git update-index --no-assume-unchanged "$symlink"
        rmdir "$symlink" >/dev/null 2>&1
        git checkout -- "$symlink"
        printf 'Restored git symlink: %s -> %s\n' "$symlink" "$(cat "$symlink")"
      done
    }
    __git_checkout_symlinks
    ETX
    )"
    
    git config --global alias.co-symlinks '!git checkout-symlinks'
    

    Usage: git checkout-symlinks [symlink] [symlink] [...], which undoes git rm-symlinks, effectively restoring the repository to its natural state (except for your changes, which should stay intact).

    And for submodules:

    git submodule foreach --recursive git checkout-symlinks
    
  4. Limitations:

    • Directories/files/symlinks with spaces in their paths should work. But tabs or newlines? YMMV… (By this I mean: don’t do that, because it will not work.)

    • If yourself or others forget to git checkout-symlinks before doing something with potentially wide-sweeping consequences like git add -A, the local repository could end up in a polluted state.

      Using our "example repo" from before:

      echo "I am nuthafile" > dir/foo/bar/nuthafile
      echo "Updating file" >> file
      git add -A
      git status
      # On branch master
      # Changes to be committed:
      #   (use "git reset HEAD <file>..." to unstage)
      #
      #       new file:   dir/foo/bar/nuthafile
      #       modified:   file
      #       deleted:    lnk_bar           # POLLUTION
      #       new file:   lnk_bar/baz       # POLLUTION
      #       new file:   lnk_bar/lnk_file  # POLLUTION
      #       new file:   lnk_bar/nuthafile # POLLUTION
      #
      

      Whoops...

      For this reason, it's nice to include these aliases as steps to perform for Windows users before-and-after building a project, rather than after checkout or before pushing. But each situation is different. These aliases have been useful enough for me that a true post-checkout solution hasn't been necessary.

Hope that helps!

References:

http://git-scm.com/book/en/Git-Internals-Git-Objects

http://technet.microsoft.com/en-us/library/cc753194

Last Update: 2019-03-13

  • POSIX compliance (well, except for those mklink calls, of course) — no more Bashisms!
  • Directories and files with spaces in them are supported.
  • Zero and non-zero exit status codes (for communicating success/failure of the requested command, respectively) are now properly preserved/returned.
  • The add-symlink alias now works more like ln(1) and can be used from any directory in the repository, not just the repository’s root directory.
  • The rm-symlink alias (singular) has been superseded by the rm-symlinks alias (plural), which now accepts multiple arguments (or no arguments at all, which finds all of the symlinks throughout the repository, as before) for selectively transforming git symlinks into NTFS hardlinks+junctions.
  • The checkout-symlinks alias has also been updated to accept multiple arguments (or none at all, == everything) for selective reversal of the aforementioned transformations.

Final Note: While I did test loading and running these aliases using Bash 3.2 (and even 3.1) for those who may still be stuck on such ancient versions for any number of reasons, be aware that versions as old as these are notorious for their parser bugs. If you experience issues while trying to install any of these aliases, the first thing you should look into is upgrading your shell (for Bash, check the version with CTRL+X, CTRL+V). Alternatively, if you’re trying to install them by pasting them into your terminal emulator, you may have more luck pasting them into a file and sourcing it instead, e.g. as

. ./git-win-symlinks.sh

Good luck!