git pull *after* git rebase?

tisek · Mar 17, 2017

I have a feature branch, and a master branch.

The master branch has evolved, and I want to bring those updates into my feature branch so that it diverges as little as possible from master.

So I git pull in both branches, git checkout feature/branch and finally git rebase master.

At this point I expect either everything to work smoothly, or conflicts to show up that I need to resolve before continuing the rebase, until all master commits are re-applied successfully on the feature branch.

Now what really happened in my case is something I do not understand:

$>git rebase master
First, rewinding head to replay your work on top of it...
Applying: myFirstCommitDoneOnTheBranch
Applying: myOtherCommitDoneOnTheBranch
$>git status
On branch feature/branch
Your branch and 'origin/feature/feature' have diverged,
and have 27 and 2 different commits each, respectively.
  (use "git pull" to merge the remote branch into yours)
nothing to commit, working tree clean
$>git pull
*load of conflicts*

Now, as much as I can understand the load of conflicts after the pull, I do not understand the need for a pull. Logically, it should roll back to the point where the branch was created from master, save the commits made on the branch, move forward to the latest commit on master and then apply the saved commits.

I do not understand what the Applying message refers to: which commits is it applying, and on top of which version?

Answer

Enrico Campidoglio · Mar 17, 2017

tl;dr You should update both master and feature, with git pull and git pull --rebase respectively, before rebasing feature on top of master. There is no need to do a git pull after you have rebased your feature branch on top of master.

With your current workflow, the reason why git status is telling you this:

Your branch and 'origin/feature' have diverged, and have 27 and 2 different commits each, respectively.

is because your rebased feature branch now has 25 new commits that aren't reachable from origin/feature (since they came from the rebase on master) plus 2 commits that are rewritten copies of the ones on origin/feature. The copies contain the same changes as the originals (i.e. they're patch equivalent) but they have different SHA-1 hashes, because in your local repository they are now based on a different commit than the one they are based on in origin/feature.

Here's an example. Let's assume that this is your history before doing git pull on master:

A - B - C (master)
         \
          D - E (feature)

After git pull, master got commit F:

A - B - C - F (master, origin/master)
         \
          D - E (feature)

At that point, you rebase feature on top of master, which re-applies D and E on top of F as the new commits D' and E':

A - B - C - F (master, origin/master)
             \
              D' - E' (feature)

In the meantime, the remote branch origin/feature is still based off commit C:

A - B - C - F (master, origin/master)
         \   \
          \   D' - E' (feature)
           \
             D - E (origin/feature)

If you do a git status on feature, Git will tell you that your feature branch has diverged from origin/feature with 3 (F, D', E') and 2 (D, E) commits, respectively.

Note that D' and E' contain the same changes as D and E but have different commit IDs because they have been rebased on top of F.
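
If you want to check these numbers yourself, here is a minimal sketch that rebuilds the example in a throwaway repository. The temporary directory, the demo identity, the local bare repository standing in for origin and the commit helper (one file per commit) are assumptions made for illustration, not part of your setup. It counts the commits on each side with git rev-list --left-right --count and uses git cherry to mark the ones that are patch equivalent.

```shell
#!/bin/sh
# Sketch: reproduce the A-B-C-F / D-E example and inspect the divergence.
set -e
cd "$(mktemp -d)"

# Demo identity so that commits work without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one new file per commit, named after the commit.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

git init -q --bare origin.git            # local stand-in for the remote
git init -q work
cd work
git symbolic-ref HEAD refs/heads/master  # same branch name on any Git version
git remote add origin ../origin.git

commit A; commit B; commit C
git push -q -u origin master
git checkout -q -b feature
commit D; commit E
git push -q -u origin feature            # origin/feature now points at E

git checkout -q master
commit F                                 # stands in for what git pull brought in
git checkout -q feature
git rebase -q master                     # rewrites D and E as D' and E'

# First number: commits only on feature. Second: only on origin/feature.
counts=$(git rev-list --left-right --count feature...origin/feature)
echo "$counts"

# git cherry prefixes a commit with "-" when an equivalent patch
# already exists on origin/feature, and with "+" otherwise.
git cherry -v origin/feature feature
equivalent=$(git cherry origin/feature feature | grep -c '^-')
```

The two commits marked with "-" are D' and E'; the one marked with "+" is F.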

The solution is to do git pull on both master and feature before rebasing feature on master. However, since you may have commits on feature that you haven't yet pushed to origin, you would want to do:

git checkout feature && git pull --rebase

to avoid creating a merge commit between origin/feature and your local feature.
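
Put together, the whole sequence could look like the sketch below. It runs in a throwaway repository, so the temporary directory, the demo identity, the local bare repository standing in for origin, the commit helper and the teammate clone that pushes F are all assumptions made for illustration; only the three steps at the end are the workflow itself.

```shell
#!/bin/sh
# Sketch: update master and feature first, then rebase feature on master.
set -e
cd "$(mktemp -d)"

# Demo identity so that commits work without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one new file per commit, named after the commit.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

git init -q --bare origin.git            # local stand-in for the remote
git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
git remote add origin ../origin.git

commit A; commit B; commit C
git push -q -u origin master
git checkout -q -b feature
commit D; commit E
git push -q -u origin feature

# A hypothetical teammate pushes commit F to master in the meantime.
git clone -q --branch master ../origin.git ../mate
(cd ../mate && commit F && git push -q origin master)

# Step 1: bring master up to date (a plain fast-forward).
git checkout -q master
git pull -q

# Step 2: bring feature up to date without creating a merge commit.
git checkout -q feature
git pull -q --rebase

# Step 3: replay the feature commits on top of the updated master.
git rebase -q master

subjects=$(git log --reverse --format=%s feature | tr -d '\n')
echo "$subjects"
```

After the last step feature contains F followed by the rebased copies of D and E, and no git pull is needed. git status still reports a divergence from origin/feature at this point, for the reasons the update on the consequences of rebasing goes into.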

Update on the consequences of rebasing:

In light of this comment, I expanded on the diverging branches. git status reports that feature and origin/feature diverge after the rebase because rebasing brings new commits into feature and also rewrites the commits that were previously pushed to origin/feature.

Consider the situation after the pull but before the rebase:

A - B - C - F (master)
         \
          D - E (feature, origin/feature)

At this point, feature and origin/feature point to the same commit E—in other words, they're in "sync". After rebasing feature on top of master, history will look like this:

A - B - C - F (master)
         \   \
          \   D' - E' (feature)
           \
             D - E (origin/feature)

As you can see, feature and origin/feature have diverged, their common ancestor being commit C. This is because feature now contains the new commit F from master plus D' and E' (read as "D prime" and "E prime") which are commits D and E applied on top of F. Even though they contain the same changes, Git considers them to be different because they have different commit IDs. Meanwhile, origin/feature still references D and E.

At this point, you've rewritten history: you've modified existing commits by virtue of rebasing them, effectively creating "new" ones.

Now, if you were to run git pull on feature this is what would happen:

A - B - C - F (master)
         \   \
          \   D' - E'- M (feature)
           \         /
             D - E - (origin/feature)

Since git pull does git fetch + git merge, this would result in the creation of the merge commit M, whose parents are E' and E.
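
The sketch below reproduces that merge in a throwaway repository and checks the parents of M. The temporary directory, the demo identity, the local bare repository standing in for origin and the commit helper are assumptions made for illustration. Recent Git releases may refuse a bare git pull on diverged branches until you choose how to reconcile them, so the sketch passes --no-rebase to ask explicitly for the merge described here. Because every demo commit touches its own file, the merge is clean, unlike the conflicts you ran into.

```shell
#!/bin/sh
# Sketch: git pull after the rebase creates a merge commit M.
set -e
cd "$(mktemp -d)"

# Demo identity so that commits work without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one new file per commit, named after the commit.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

git init -q --bare origin.git            # local stand-in for the remote
git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
git remote add origin ../origin.git

commit A; commit B; commit C
git push -q -u origin master
git checkout -q -b feature
commit D; commit E
git push -q -u origin feature            # origin/feature points at E

git checkout -q master
commit F
git checkout -q feature
git rebase -q master                     # feature is now F - D' - E'

rebased_tip=$(git rev-parse feature)     # E'

# fetch + merge: joins E' and E in a new merge commit M.
git pull -q --no-rebase --no-edit

first_parent=$(git rev-parse HEAD^1)     # parent on the feature side
second_parent=$(git rev-parse HEAD^2)    # parent on the origin/feature side
git log --graph --oneline
```

The first parent of M is the rebased tip E' and the second is the commit origin/feature points at, E.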

If, instead, you ran git pull --rebase (that is, git fetch + git rebase) then Git would:

  1. Move feature to commit C (the common ancestor of feature and origin/feature)
  2. Apply D and E from origin/feature
  3. Apply F, D' and E'

However, noticing that D' and E' contain the same changes as D and E, Git would just discard them, resulting in a history looking like this:

A - B - C - F (master)
         \   
          D - E - F' (feature)
              ^
             (origin/feature)

Notice how commit F, previously reachable from feature, got applied on top of origin/feature resulting in F'. At this point, git status would tell you this:

Your branch is ahead of 'origin/feature' by 1 commit.

That commit being, of course, F'.
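
Here is a minimal sketch of that last scenario, again in a throwaway repository. The temporary directory, the demo identity, the local bare repository standing in for origin and the commit helper are assumptions made for illustration. It relies on the default rebase behaviour of skipping commits whose changes are already present upstream, so the exact result may vary with your Git version and configuration.

```shell
#!/bin/sh
# Sketch: git pull --rebase after the rebase drops D' and E', keeps F'.
set -e
cd "$(mktemp -d)"

# Demo identity so that commits work without any Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: one new file per commit, named after the commit.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

git init -q --bare origin.git            # local stand-in for the remote
git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
git remote add origin ../origin.git

commit A; commit B; commit C
git push -q -u origin master
git checkout -q -b feature
commit D; commit E
git push -q -u origin feature            # origin/feature points at E

git checkout -q master
commit F
git checkout -q feature
git rebase -q master                     # feature is now F - D' - E'

# fetch + rebase onto origin/feature: D' and E' duplicate D and E,
# so only F is replayed, as F'.
git pull -q --rebase

subjects=$(git log --reverse --format=%s feature | tr -d '\n')
counts=$(git rev-list --left-right --count feature...origin/feature)
echo "$subjects"
git status -sb | head -n 1
```

The commit subjects on feature now read A, B, C, D, E, F, and the only commit that origin/feature lacks is F'.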