Best Practice for Git Repositories with multiple projects in traditional n-tier design

Terry picture Terry · Apr 1, 2011 · Viewed 18.6k times · Source

I'm making the switch from a centralized SCM system to GIT. OK, I'll admit which one, it is Visual SourceSafe. So in addition to getting over the learning curve of Git commands and workflow, the biggest issue I'm currently facing is how to migrate our current repository over to Git in regards to single or some flavor of multiple repositories.

I've seen this question asked in a variety of ways, but normally just the basic..."I have applications that want to share some lower level libraries" and the canned response is always "use separate repositories" and/or "use Git submodules" without much explanation of when/why this pattern should be used (what does it overcome, what does it eliminate?) From my limited knowledge/reading on Git so far, it appears that submodules may have their own demons to battle, especially for someone new to Git.

However, what I've yet to see someone blatantly ask is, "When you have the traditional n-tier development (UI, Business, Data, and then Shared Tools) where each layer is its own project, should you use one or multiple repositories?" It is not clear to me because almost always, when a new 'feature' is added, code changes ripple through each layer.

To complicate matters with respect to Git, we've duplicated these layers across 'frameworks' to make more manageable projects/components from a developer's perspective. For the purpose of this discussion, lets call these collection of projects/layers 'Tahiti', which represents an entire 'product'.

The final 'layer' in our set up is the addition of client websites/projects which customize/expand upon Tahiti. Representing this in a folder structure might best look like:

/Clients
  /Client1
  /Client2

/UI Layer
  /CoreWebsite (views/models/etc)
  /WebsiteHelper (contains 'web' helpers appropriate for any website)
  /Tahiti.WebsiteHelpers (contains 'web' helpers only appropriate for Tahiti sites)

/BusinessLayer (logic projects for different 'frameworks')
  /Framework1.Business
  /Framework2.Business

/DataLayer
  /Framework1.Data
  /Framework2.Data

/Core (projects that are shared tools useable by any project/layer)
  /SharedLib1
  /SharedLib2

After explaining how we've expanded on the traditional n-tier design with multiple projects, I'm looking for any experience on what decision you've made with a similar situation (even the simple UI, Business, Data separation was all that you used) and what was easier/harder because of your decision. Am I right in my preliminary reading on how submodules can be a bit of pain? More pain than is worth the benefit?

My gut reaction is to one repository for Tahiti (all projects excluding the 'client projects'), then one repository for each client. The entire Tahiti source I'm guessing has to be <10k files. Here is my reasoning (and I welcome criticism)

  • It seems to me, that in Git you want to track history of 'features' vs individual 'projects/files', and even with our project separation, a 'feature' will always span multiple projects.
  • A 'feature' coded in the core site will almost always minimally effect the core website and all projects for a 'framework' (i.e. CoreWebsite, Framework1.Business, Framework1.Data)
  • A feature can easily span multiple frameworks (I'd say 10% of the features we implement would span frameworks - CoreWebsite, Framework1.Business, Framework1.Data, Framework2.Business, Framework2.Data)
  • In a similar fashion, a feature could require changes to 1 or more SharedLib projects and/or the 'UI website helper' projects.
  • Changes to client's custom code will almost always only be local to their repository and not require tracking changes to other components to see what the 'entire feature change set' was.
  • Given that a feature spans projects to see the entire scope, if each project was its own repository, it seems it would be a pain to try to analyze *all* code changes across repositories?

Thanks in advance.

Answer

Koby picture Koby · Apr 7, 2011

The reason most people advise to do separate repositories is because it separates out changes and change sets. If someone makes a change to the client projects (which you say doesn't really effect others), there is no reason for someone to update the entire code base. They can simply just get the changes from the project(s) they care about.

Git Submodules are like Externals in Subversion. You can set up your git repositories so that each one is a separate layer, and then use submodules to include the projects that are needed in the various hierarchies you have.

So if for example:

/Core -- It's own git repository that contains it's base files (as you had outlined)
  /SharedLib1
  /SharedLib2

/UI Layer -- Own git repository 
  /CoreWebsite
  /WebsiteHelper
  /Tahiti.WebsiteHelpers
  /Core -- Git Submodule to the /Core repository
    /SharedLib1
    /SharedLib2

This ensures that any updates to the /Core repository are brought into UI Layer repository. It also means that if you have to update your shared libraries you don't have to do it across 5-6 projects.