Create a submodule repository from a folder and keep its git commit history

GabLeRoux picture GabLeRoux · Jul 1, 2013 · Viewed 41.5k times · Source

I have a web application that explores other web applications in a particular way. It contains some web demos in a demos folder and one of the demo should now have it's own repository. I would like to create a separate repository for this demo application and make it a subpackage submodule from main repository without losing its commit history.

Is it possible to keep the commit history from the files in a repository's folder and create a repository from it and use it as a submodule instead?

Answer

GabLeRoux picture GabLeRoux · Aug 8, 2013

Detailed Solution

See the note at the end of this answer (last paragraph) for a quick alternative to git submodules using npm ;)

In the following answer, you will know how to extract a folder from a repository and make a git repository from it and then including it as a submodule instead of a folder.

Inspired from Gerg Bayer's article Moving Files from one Git Repository to Another, Preserving History

At the beginning, we have something like this:

<git repository A>
    someFolders
    someFiles
    someLib <-- we want this to be a new repo and a git submodule!
        some files

In the steps below, I will refer this someLib as <directory 1>.

At the end, we will have something like this:

<git repository A>
    someFolders
    someFiles
    @submodule --> <git repository B>

<git repository B>
    someFolders
    someFiles

Create a new git repository from a folder in an other repository

Step 1

Get a fresh copy of the repository to split.

git clone <git repository A url>
cd <git repository A directory>

Step 2

The current folder will be the new repository, so remove the current remote.

git remote rm origin

Step 3

Extract history of the desired folder and commit it

git filter-branch --subdirectory-filter <directory 1> -- --all

You should now have a git repository with the files from directory 1 in your repo's root with all related commit history.

Step 4

Create your online repository and push your new repository!

git remote add origin <git repository B url>
git push

You may need to set the upstream branch for your first push

git push --set-upstream origin master

Clean <git repository A> (optional, see comments)

We want to delete traces (files and commit history) of <git repository B> from <git repository A> so history for this folder is only there once.

This is based on Removing sensitive data from github.

Go to a new folder and

git clone <git repository A url>
cd <git repository A directory>
git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch <directory 1> -r' --prune-empty --tag-name-filter cat -- --all

Replace <directory 1> by the folder you want to remove. -r will do it recursively inside the specified directory :). Now push to origin/master with --force

git push origin master --force

Boss Stage (See Note below)

Create a submodule from <git repository B> into <git repository A>

git submodule add <git repository B url>
git submodule update
git commit

Verify if everything worked as expected and push

git push origin master

Note

After doing all of this, I realized in my case that it was more appropriate to use npm to manage my own dependencies instead. We can specify git urls and versions, see the package.json git urls as dependencies.

If you do it this way, the repository you want to use as a requirement must be an npm module so it must contain a package.json file or you'll get this error: Error: ENOENT, open 'tmp.tgz-unpack/package.json'.

tldr (alternative solution)

You may find it easier to use npm and manage dependencies with git urls:

  • Move folder to a new repository
  • run npm init inside both repositories
  • run npm install --save git://github.com/user/project.git#commit-ish where you want your dependencies installed