I have read several threads on removing large binary files from Git commit history, but my problem is slightly different, so I am asking here to understand and confirm the steps.
My git repo is ~/foo. I want to remove all *.jpg, *.png, *.mp4, *.ogv (and so on) from one of the directories inside the repo, specifically from ~/foo/public/data.
~/foo/data > find -E . -regex ".*\.(jpg|png|mp4|m4v|ogv|webm)" \
-exec git filter-branch --force --index-filter \
'git rm --cached --ignore-unmatch {}' \
--prune-empty --tag-name-filter cat -- --all \;
~/foo/data > cd ..
~/foo > git add .gitignore
~/foo > git commit -m "added binary files to .gitignore"
~/foo > git push origin master --force
Am I on the right track above? I want to measure twice before I cut once, so to speak.
Update: Well, the above gives me the error
You need to run this command from the toplevel of the working tree.
You need to run this command from the toplevel of the working tree.
..
So I went up the tree to the top level and re-ran the command, and it all worked.
The process seems right.
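One caveat: the find -exec form runs a complete git filter-branch rewrite once per matching file, and it only sees files that still exist in the working tree. Below is a minimal sketch of a single-pass alternative, assuming a POSIX shell and a directory name without single quotes. purge_media_history is a hypothetical helper name, not a Git command, and the extension list is copied from the question.

```shell
# Sketch: purge media files under one directory from every commit in a single pass.
# purge_media_history is a hypothetical helper; $1 is a directory relative to the repo top level.
# The body runs in a subshell, so the caller's working directory is left unchanged.
purge_media_history() (
    dir="$1"
    # filter-branch must be started from the top level of the working tree
    cd "$(git rev-parse --show-toplevel)" || exit 1
    # Git pathspec globs such as 'dir/*.jpg' also match files in subdirectories of dir.
    # FILTER_BRANCH_SQUELCH_WARNING only silences the deprecation notice on newer Git versions.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force --index-filter \
        "git rm --cached --quiet --ignore-unmatch -- \
            '$dir/*.jpg' '$dir/*.png' '$dir/*.mp4' '$dir/*.m4v' '$dir/*.ogv' '$dir/*.webm'" \
        --prune-empty --tag-name-filter cat -- --all
)
```

It would be called as purge_media_history public/data from anywhere inside ~/foo. Note that filter-branch keeps the old commits reachable under refs/original/, so the repository does not shrink until those refs are removed and the repo is garbage-collected.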
You can also test your cleanup process with a tool like BFG Repo-Cleaner, as in this answer:
java -jar bfg.jar --delete-files '*.{jpg,png,mp4,m4v,ogv,webm}' ${bare-repo-dir}
(Note that BFG makes sure it doesn't delete anything in your latest commit, so you first need to remove those files from the current index and make a "clean" commit. BFG will then clean all previous commits.)
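The "clean" commit can be sketched as follows. This is an illustration under assumptions, not part of BFG: untrack_media is a hypothetical helper name, the extension list is taken from the question, and --cached means the files stay on disk and are only dropped from the index.

```shell
# Sketch: stop tracking media files under one directory in the latest commit,
# so that BFG no longer treats them as protected content of HEAD.
# untrack_media is a hypothetical helper; $1 is a directory relative to the repo top level.
untrack_media() (
    dir="$1"
    cd "$(git rev-parse --show-toplevel)" || exit 1
    # Quoted patterns are expanded by Git (not the shell) and also match subdirectories.
    git rm --cached --quiet --ignore-unmatch -- \
        "$dir/*.jpg" "$dir/*.png" "$dir/*.mp4" "$dir/*.m4v" "$dir/*.ogv" "$dir/*.webm" || exit 1
    # Commit only if the index actually changed
    git diff --cached --quiet || git commit --quiet -m "Stop tracking media files in $dir"
)
```

After untrack_media public/data, the commit needs to be pushed before BFG is run against a fresh bare clone of the repository. Adding the patterns to .gitignore, as in the question, keeps the files from being re-added later.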
Update 2020: for removing files, you would now use git filter-repo (Git 2.22+, Q4 2019), since git filter-branch or BFG are now, 7 years later, obsolete.
git filter-repo --path fileToRemove --invert-paths
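For the case in the question (several extensions under one directory), a sketch using --path-glob might look like the following. filter_repo_purge_media is a hypothetical helper name. It assumes git filter-repo is installed, since it is a separate tool and not bundled with Git, and that it runs in a fresh clone, because filter-repo refuses by default to rewrite anything else.

```shell
# Sketch: drop media files under one directory from all history with git filter-repo.
# filter_repo_purge_media is a hypothetical helper; $1 is a directory relative to the repo top level.
filter_repo_purge_media() (
    dir="$1"
    cd "$(git rev-parse --show-toplevel)" || exit 1
    # Each --path-glob selects paths; --invert-paths turns the selection into "remove these".
    # filter-repo globs match against the full path, so files in subdirectories are included.
    git filter-repo --invert-paths \
        --path-glob "$dir/*.jpg" \
        --path-glob "$dir/*.png" \
        --path-glob "$dir/*.mp4" \
        --path-glob "$dir/*.m4v" \
        --path-glob "$dir/*.ogv" \
        --path-glob "$dir/*.webm"
)
```

It would be called as filter_repo_purge_media public/data inside the fresh clone. filter-repo removes the origin remote afterwards, so the remote has to be added back before force-pushing.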