Correct Usage of BFG Repo Cleaner

Bill Greer · Mar 30, 2018

The BFG Repo Cleaner site gives an example of using the tool as follows to clean up a repository:

  1. Clone a fresh copy of your repo.

    $ git clone --mirror git://example.com/some-big-repo.git
    
  2. Run BFG to clean up your repo.

    $ java -jar bfg.jar --strip-blobs-bigger-than 100M some-big-repo.git
    
  3. Use git gc to strip out the unwanted dirty data.

    $ cd some-big-repo.git
    $ git reflog expire --expire=now --all && git gc --prune=now --aggressive
    
  4. Push changes back up to the remote.

    $ git push
    

I understand that HEAD is protected, so any file in HEAD that is larger than 100M will still be there. If I run this tool as described, I will lose all history of said 100M file, correct? So if there is an old version of that file in an old commit, it's gone and I will not be able to use it in its previous state, correct?

Also, I have a coworker that stated the following and I am wondering if it is true:

If you push back to the repository that was mirrored in TFS, the changes to your pack file won't be reflected on the remote or in future clones.

You have to create a new repository in TFS and push the mirror there for the remote to pick up the pack file changes.

Answer

Daniel Mann · Mar 30, 2018

Any file still present at the HEAD of the repo will be preserved, including the history. It's to protect you from making mistakes. The idea is that you should explicitly delete the file, commit the deletion, then clean up the history to remove it.
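To illustrate why committing the deletion is only the first half of that workflow, here is a minimal sketch using plain git in a throwaway repository. The file name `big.bin`, the temp directory, and the demo identity are all made up for illustration, and the file is kept tiny so the script runs quickly. It shows that after the deletion commit the file is gone from HEAD but its blob is still reachable from history, which is the part a history cleaner such as BFG then has to remove.

```shell
#!/bin/sh
set -eu

# Throwaway repository in a temp directory (all names here are illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit a stand-in for the "large" file (kept small so the demo is fast).
head -c 2048 /dev/zero > big.bin
git add big.bin
git commit -q -m "Add big.bin"
blob=$(git rev-parse HEAD:big.bin)

# Explicitly delete the file and commit the deletion.
git rm -q big.bin
git commit -q -m "Remove big.bin"

# HEAD no longer contains the file ...
if git cat-file -e HEAD:big.bin 2>/dev/null; then
  in_head=yes
else
  in_head=no
fi

# ... but the blob is still reachable from an older commit.
if git rev-list --objects --all | grep -q "^$blob"; then
  in_history=yes
else
  in_history=no
fi

echo "in HEAD: $in_head, in history: $in_history"
```

Once the deletion is committed and pushed, the file is no longer in HEAD, so the mirror-clone and BFG steps from the question can strip it from the older commits.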

TFS does not run git gc on its repositories, so your colleague is correct. See the question "Team Foundation Server 2015 (tfs2015) run git gc --prune=now on orgin/remote" for confirmation.
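The colleague's workaround (push the mirror to a newly created repository) can be sketched with plain git. This is only an illustration of the mechanics under stated assumptions: the local bare repositories `old-remote.git` and `new-remote.git` are hypothetical stand-ins for the existing and the new TFS repositories, and BFG itself is not run here. With a real server you would pass the new repository's URL to `git push --mirror` instead of a local path.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Stand-in for the existing TFS repository (a local bare repo, hypothetical).
git init -q --bare old-remote.git

# Seed it with one commit via a scratch clone.
git clone -q old-remote.git seed 2>/dev/null
cd seed
git config user.email "demo@example.com"
git config user.name "Demo"
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git push -q origin HEAD
cd ..

# Mirror-clone, as in step 1 of the BFG instructions.
git clone -q --mirror old-remote.git some-big-repo.git

# (BFG, reflog expire, and git gc would run against some-big-repo.git here.)

# Stand-in for the newly created, empty TFS repository.
git init -q --bare new-remote.git

# Push every ref from the mirror to the new remote instead of the old one.
cd some-big-repo.git
git push -q --mirror "$work/new-remote.git"
cd ..

old_head=$(git --git-dir=old-remote.git rev-parse HEAD)
new_head=$(git --git-dir=new-remote.git rev-parse HEAD)
echo "old: $old_head"
echo "new: $new_head"
```

Because no history is rewritten in this sketch, both remotes end up at the same commit. After a real BFG run the rewritten commits would have new hashes, and the new remote would only ever receive the objects reachable from the cleaned refs.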