Is Git recommended for large (>250GB) content repositories

kaychaks picture kaychaks · Jun 16, 2009 · Viewed 18.5k times · Source

The web-application is a custom-built CMS which has several sub-applications and each one of them has code and content residing in the same directory structure. Due to the application framework's architecture the code and content are intertwined (content depends upon the code for its display and other functionalities) and hence are inseparable. The contents are not stored as BLOB rather they are stored as files and the underlying DB is used to link them. Size of sub-applications ranges from 20GB - 250GB and more (this is the killer).

The web-application will go for some enhancements in code (new sub-applications, bug-fixes etc.) and at the same time users will add/update the contents through the already live system. Hence, a deployment/release process is required and most importantly a version control system needs to be suggested for both code and content.

Git comes to the picture because of reasons - it is open-source & free, ease of branching & merging, its not centralized & hence no single-point-of-failure.

BUT after some initial research in the web, I found out some disappointing facts which are applicable to our application - using Git for large systems like ours is painful (checkout, clone, merge, push, pull) and commands are complicated ("geeky" would be more appropriate) for a developer base which is DVCS ignorant and mostly Windows users.

There is no fixed mindset for Git but if I have to go for a centralized approach (in really WORST case) then what should be the way (CVS & SVN apart). I have read about Perforce being a stable one and is also used in Google (I expect some brashes here!!).

Please share, guide and comment your views. I really require them.

Answer

pgs picture pgs · Jun 16, 2009

I just happened to be reading this blog post not one minute ago. It's a bit of a rant about the scalability of git.

Edit: Eight years later, and Git has Large File Storage (LFS), and Microsoft is open sourcing Git Virtual File System (GVFS) so they can use git to develop Windows.