What does git lfs migrate do?

Mohan · Aug 10, 2018

I thought that git lfs migrate rewrote the history of a repo so that specified large files were kept in LFS. This means that the repo should get smaller, because it doesn't directly contain all versions of large files. However, when I run

git lfs migrate import --include="test-data/**" --include-ref=refs/heads/master

all of the files in the test-data/ directory are replaced with pointer files that look like this:

version https://git-lfs.github.com/spec/v1
oid sha256:5853b5a2a95eaca53865df996aee1d911866f754e6089c2fe68875459f44dc55
size 19993296
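
For reference, a pointer file like the one above is plain text with one `key value` pair per line. The following is a minimal sketch, not part of Git LFS itself, that reads the `oid` and `size` fields with awk. The function name `lfs_pointer_info` is made up for illustration, and the printed object path assumes the default local storage layout under `.git/lfs/objects`.

```shell
#!/bin/sh
# Hypothetical helper (not a Git LFS command): print the oid, the size, and the
# assumed local storage path for a Git LFS pointer file.
lfs_pointer_info() {
    pointer_file=$1

    # A pointer file starts with a "version" line naming the LFS spec.
    if ! head -n 1 "$pointer_file" | grep -q '^version https://git-lfs.github.com/spec/'; then
        echo "not an LFS pointer: $pointer_file" >&2
        return 1
    fi

    # "oid sha256:<hex>" -> keep only the hex digest.
    oid=$(awk '$1 == "oid" { sub(/^sha256:/, "", $2); print $2 }' "$pointer_file")
    # "size <bytes>" -> size of the real file content in bytes.
    size=$(awk '$1 == "size" { print $2 }' "$pointer_file")

    echo "oid:  $oid"
    echo "size: $size bytes"
    # Assumption: default layout .git/lfs/objects/<first 2 hex>/<next 2 hex>/<oid>.
    echo "path: .git/lfs/objects/$(printf '%s' "$oid" | cut -c1-2)/$(printf '%s' "$oid" | cut -c3-4)/$oid"
}
```

Calling `lfs_pointer_info` on one of the replaced files in test-data/ should show which object under .git/lfs holds the real content, which is where the space moved to.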

And the .git folder becomes twice as large (400MB to 800MB). I am confused. What's git lfs migrate doing?

Edit: I did clean up after the migration by running

git reflog expire --expire-unreachable=now --all
git gc --prune=now

before running du. Afterwards, most of the space is used by these folders:

414M .git/objects
398M .git/lfs
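
To compare these two directories before and after a migration or cleanup, a small sketch like the following may help. The function name `git_dir_sizes` is an illustrative assumption; it only wraps `du` and changes nothing in the repository.

```shell
#!/bin/sh
# Hypothetical helper: print the disk usage in kilobytes of the regular object
# store and the LFS object store inside a repository's .git folder.
git_dir_sizes() {
    repo=${1:-.}
    for dir in objects lfs; do
        if [ -d "$repo/.git/$dir" ]; then
            # du -sk prints "<kilobytes><TAB><path>"; keep only the number.
            kb=$(du -sk "$repo/.git/$dir" | awk '{ print $1 }')
            echo "$dir: ${kb} KB"
        else
            echo "$dir: (missing)"
        fi
    done
}
```

Running `git_dir_sizes` in the repository root once before the migration, once after it, and once after the cleanup commands gives three snapshots that show where the space goes at each step.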

Answer

Ido Ran · Oct 15, 2018

The only problem is that the original Git objects for the binary files are still in the .git folder because you haven't garbage-collected them.

You should follow the git lfs migration tutorial which explains:

The above successfully converts pre-existing git objects to lfs objects. However, the regular objects still persist in the .git directory. These will be cleaned up eventually by git, but to clean them up right away, run:

git reflog expire --expire-unreachable=now --all
git gc --prune=now

After running that, your .git folder should be about the same size as it was before the migration, but if you look inside it you should see that objects is now much smaller than before and that lfs holds the rest.

The even better news is that when other developers or applications clone the repo, they only have to download the objects directory and then fetch only the large files for what they check out, not every version in the history.