What is a good workflow for submodule forks?

Ivo Jansch · Aug 24, 2011 · Viewed 9.7k times

Suppose we have the following repository structure on GitHub:

company:project.git
  \- company:submodule.git

A developer in my company forks the company project, making his workspace look like this:

developer:project.git
  \- company:submodule.git

This is fine for 90% of the developers since they don't change the submodule library, they only work in the project. Now suppose there's a new feature which requires improvements in the submodule. The developer charged with this converts his workspace to this:

developer:project.git
   \- developer:submodule.git

Getting there is not trivial as he needs to replace a submodule with another submodule (to git, the original and the fork of the submodule are two different things).

If this developer works on the library for a bit longer, he commits this structure to his master branch, so his fork on GitHub always uses the forked submodule.

Once he's done with development, he'll create a pull request. The problem is that once the pull request is merged, the main repository will look like this:

company:project.git
   \- developer:submodule.git

This is problematic as now every developer that tracks the company branch will end up with the developer's submodule.

To work around the problem, the developer has to point his master branch back at company:submodule.git before making a pull request, which is very awkward, especially since locally he'll always still want to work with developer:submodule.git.

We've tried several workflows, and the above issue is the only one where we don't have a good workflow yet.

Answer

Mark Longair · Aug 24, 2011

When the developer creates a commit with the submodule at a particular version, that's a strong statement that the supermodule works with the submodule at that exact version. If his code does actually work with the company's version of the submodule, I think the right thing to do is for the developer to:

  1. branch the supermodule
  2. check out the company version in the submodule
  3. update .gitmodules in the supermodule, if the developer changed that from the upstream version
  4. stage and commit that change
  5. test everything
  6. issue the pull request

He can then switch back to his normal development branch in the supermodule.
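
Steps 1 to 4 above could be scripted roughly as follows. This is only a sketch under some assumptions: the function name and its parameters are made up for illustration, the submodule's name in .gitmodules is assumed to equal its path (the default), the company's branch is assumed to be master, and the company remote is assumed to have been added inside the submodule already.

```shell
# Hypothetical helper; run it from the top of the supermodule.
# $1 = submodule path (assumed to equal the submodule name)
# $2 = name of the company remote inside the submodule
# $3 = company URL to record in .gitmodules
# $4 = name of the new supermodule branch for the pull request
prepare_pr_branch() {
  sub_path=$1
  company_remote=$2
  company_url=$3
  pr_branch=$4

  # 1. Branch the supermodule.
  git checkout -b "$pr_branch" || return 1

  # 2. Check out the company version in the submodule (detached HEAD).
  git -C "$sub_path" fetch -q "$company_remote" || return 1
  git -C "$sub_path" checkout -q --detach "$company_remote/master" || return 1

  # 3. Point .gitmodules at the company's repository.
  git config -f .gitmodules "submodule.$sub_path.url" "$company_url" || return 1

  # 4. Stage and commit the new URL and the new submodule commit.
  git add .gitmodules "$sub_path" || return 1
  git commit -q -m "Use the company's version of $sub_path"
}
```

It could be called as prepare_pr_branch lib company developer@company:/srv/git/lib.git for-pull-request. Steps 5 and 6 (testing and the pull request itself) stay manual. After switching the supermodule back to the development branch, the submodule's working tree is normally not moved for you; git submodule update checks out the commit that branch records.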

One thing I don't understand about your question is the following:

"Getting there is not trivial as he needs to replace a submodule with another submodule (to git, the original and the fork of the submodule are two different things)."

On the contrary, the submodule can be any git repository so long as it contains the commit which the supermodule points to. If there are two different remote repositories, he can just add an extra remote in the submodule. (The developer should change .gitmodules as well if they're going to share their repository with anyone else.)
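
To illustrate the extra-remote idea, here is a small throwaway sandbox. All repository names and paths in it are hypothetical stand-ins (company-lib for company:submodule.git, developer-lib for the fork, and a plain clone called lib playing the role of the submodule checkout); it only shows that one repository can hold both histories.

```shell
#!/bin/sh
# Throwaway sandbox; every repository name and path below is hypothetical.
set -e
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in for company:submodule.git, with one commit on master.
git init -q company-lib
(
  cd company-lib
  git symbolic-ref HEAD refs/heads/master
  echo "v1" > lib.txt
  git add lib.txt
  git commit -q -m "Initial library commit"
)

# Stand-in for developer:submodule.git: a clone with one extra commit.
git clone -q company-lib developer-lib
(
  cd developer-lib
  echo "v2" > lib.txt
  git commit -q -am "Developer improvement"
)

# A checkout that starts out knowing only the company's repository.
git clone -q company-lib lib
cd lib

# Add the fork as a second remote; origin is left untouched.
git remote add developer "$sandbox/developer-lib"
git fetch -q developer

# List the remotes, then the remote branches containing the company's commit.
git remote
git branch -r --contains "$(git rev-parse origin/master)"
```

Since the fork was cloned from the company's repository, the company's commit is reachable from both remotes, so a supermodule pointing at that commit is satisfied by either one.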


In response to your comment below, perhaps it's worth going through how to switch a submodule from one version to another. Let's suppose that the developer is using their own repositories for the supermodule and submodule, but that both were cloned from the company's versions (so most of the history is the same), and that the submodule is at the path lib. The developer now wants to switch the submodule to point to the company's version instead. They can do the following:

  1. Edit the url parameter for the submodule in .gitmodules to point to the company's repository.
  2. cd lib
  3. git remote add company developer@company:/srv/git/lib.git
  4. git fetch company
  5. git checkout -b upstream-master company/master
  6. cd ..
  7. git add .gitmodules lib
  8. git commit -m "Switch the lib submodule to point back to the company's version"

Once the remote and branch are set up, steps 3 to 5 can be replaced by a single git checkout <whatever>.
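
One related detail: editing .gitmodules as in step 1 does not touch the copy of the URL that git keeps in the supermodule's .git/config, and git submodule sync is the command that copies it over. A minimal sketch, assuming the submodule's name equals its path; the function name is made up for illustration:

```shell
# Hypothetical helper; run it from the top of the supermodule
# after the url in .gitmodules has been changed.
# $1 = submodule path (assumed to equal the submodule name)
sync_submodule_url() {
  sub_path=$1

  # URL recorded in the tracked .gitmodules file.
  tracked_url=$(git config -f .gitmodules --get "submodule.$sub_path.url") || return 1

  # Copy that URL into the supermodule's .git/config (and, for a
  # checked-out submodule, into its remote configuration).
  git submodule --quiet sync -- "$sub_path" || return 1

  # Report both values so a mismatch is easy to spot.
  echo "tracked: $tracked_url"
  echo "local:   $(git config --get "submodule.$sub_path.url")"
}
```

For the example above this would be sync_submodule_url lib. Anyone else who pulls the changed .gitmodules into an existing clone would need the same sync before their submodule follows the new URL.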