git status takes too long

Jacob Krieg · Jun 1, 2013

I'm working on a project where the version control system is SVN and I want to use git. I did a git svn clone, but git status is terribly slow (around 8 minutes). The repository has around 63000 files and most of them are libraries ignored by git. Is this normal? I did a git prune && git gc to clean up unreachable objects and run a garbage collection. I also did a git repack -Adf, but this made things even worse: git status now takes even longer (more than 20 minutes).

What am I doing wrong? This is a Visual Studio project and I assume that the .gitignore file does not contain the right entries. Is it possible to find out exactly which files are generated by a Visual Studio build and which have to be versioned?

If the .gitignore file is not the problem, how can I make git status faster? Is it normal for git to be that slow on a project with 65000 files (around 10GB)?

Answer

me_and · Jun 5, 2013

For a repository of that size, git status and associated commands can be very slow. Git works much better when projects are teased apart and separated, while Subversion tends to encourage using single behemoth repositories containing multiple projects, so this sort of problem isn't uncommon when using Git-SVN.

Nonetheless, there are a few different solutions you can use to speed things up:

  • If you haven't already, upgrade to using a solid state disk rather than a magnetic disk. This single change made a massive difference to Git's speed when I was working on a similar repository.

  • Look at the Configuration section of git help svn. That describes setting up Git-SVN to track subfolders in the Subversion repository (e.g. trunk/project-a, branches/*/project-a, tags/*/project-a, …) rather than the entire repository. If this makes sense for your repository, it'll mean you can have much smaller checkouts and so much faster runs of git status.

  • Look at the Sparse Checkout section of git help read-tree. That'll talk you through setting up Git to use a sparse working copy, similar to a Subversion sparse checkout. Again, this means that there'll be fewer files Git's tracking in your working copy, and hence checking them all will again be quicker.

  • Consider setting the "assume unchanged" flag on large sections of your working copy. This will tell Git to not bother checking if the files have changed. There are two ways of doing this:

    1. To set the flag for specific folders, run something like the following:

      find <folder-name>... -type f -exec git update-index --assume-unchanged {} +
      
    2. To set the flag for the entire repository (note this will lose uncommitted changes):

      git config core.ignorestat true
      git reset --hard HEAD
      

    Take a look at the --assume-unchanged option in git help update-index and the core.ignoreStat section in git help config for some more information about how these work.

    Using these will mean you need to specify paths to commands like git diff and git add explicitly, i.e. commands like a bare git diff, git commit -a, etc. won't pick up changes to the flagged files.

  • Change your operating system and/or file system. According to the Git man pages (the same ones as in the previous bullet), Windows' lstat is slow, as is the CIFS file system. I suspect the ideal is something like ext3 or ext4 on Linux or some other *nix.
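
As a rough illustration of the sparse checkout suggestion above, here's a minimal sketch of the mechanism described in git help read-tree. The function name sparse_only and the directory names are hypothetical, and it assumes you run it from the top of the repository with no uncommitted changes in the paths being dropped:

```shell
#!/bin/sh
# Sketch only: restrict the working copy to the given top-level directories.
# sparse_only is a hypothetical helper name, not a Git command.
sparse_only() {
    # Turn on sparse checkout support for this repository.
    git config core.sparseCheckout true || return 1

    # The pattern file lives under the repository's git directory.
    info_dir="$(git rev-parse --git-dir)/info"
    mkdir -p "$info_dir"

    # Write one gitignore-style pattern per directory to keep.
    : > "$info_dir/sparse-checkout"
    for dir in "$@"; do
        printf '/%s/\n' "$dir" >> "$info_dir/sparse-checkout"
    done

    # Re-read HEAD so files outside the patterns leave the working copy.
    git read-tree -mu HEAD
}
```

Calling something like sparse_only src (where src is a placeholder for a folder you actually work in) should leave only that folder checked out. Everything else stays in the repository history and just isn't present on disk.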
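
For the "assume unchanged" option, one possible variation on the find command above is to feed git update-index from git ls-files, so that only tracked files get the flag. This is a sketch with hypothetical helper names (mark_unchanged, unmark_unchanged, list_unchanged), and it assumes an xargs that supports -0:

```shell
#!/bin/sh
# Sketch only: the function names below are hypothetical helpers.

# Set the "assume unchanged" flag on every tracked file under the given paths.
mark_unchanged() {
    git ls-files -z -- "$@" | xargs -0 git update-index --assume-unchanged --
}

# Clear the flag again, e.g. before editing files in those paths.
unmark_unchanged() {
    git ls-files -z -- "$@" | xargs -0 git update-index --no-assume-unchanged --
}

# List flagged files: git ls-files -v marks them with a lowercase letter.
list_unchanged() {
    git ls-files -v | grep '^[[:lower:]]'
}
```

Running list_unchanged every so often is a cheap way to remind yourself which files Git has stopped checking, since changes to them won't show up in git status until the flag is cleared.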