How do cherry-pick and revert work?

Tim · Jan 3, 2016

I am trying to understand what cherry-pick and revert do, in terms of set operations in math.

In the following, "-" means diff (similar to set difference in math, except that "A-B" records both what is in A but not in B and, negated, what is in B but not in A), and "+" means patch (i.e. disjoint union in math; I haven't used patch before, so I am not sure).

From Version Control with Git, by Loeliger, 2ed

  1. The command git cherry-pick commit applies the changes introduced by the named commit on the current branch. It will introduce a new, distinct commit. Strictly speaking, using git cherry-pick doesn’t alter the existing history within a repository; instead, it adds to the history.

    Is it correct that F' = (F-B) + Z?

  2. The git revert commit command is substantially similar to the command git cherry-pick commit with one important difference: it applies the inverse of the given commit. Thus, this command is used to introduce a new commit that reverses the effects of a given commit.

    Is it correct that D' = G - D?

Answer

Edward Thomson · Jan 3, 2016

cherry-pick

Is it correct that F' = (F-B) + Z?

No, that would also introduce the changes that were introduced in C, D and E.

git-cherry-pick works by isolating the unique changes in the commit to be cherry-picked (i.e., F-E in this example, ignoring additional ancestors including the merge base), and applying them to the target.

This is not done with patch application, but with the three-way merge algorithm: the parent of the commit to be cherry-picked is used as the common ancestor, the commit to be cherry-picked is one side of the merge, and the target is the other side. The result combines the changes introduced by the cherry-picked commit with the changes already in the target.

For example, if E is the parent of the commit to be cherry-picked, and its contents (acting as the common ancestor) are:

Line 1
Line 2
Line 3
Line 4
Line 5

And F, the commit to be cherry-picked, has these contents:

Line 1
Line 2
Line Three
Line 4
Line 5

And the target of the cherry-pick Z is:

LINE 1
Line 2
Line 3
Line 4
Line 5!

Then the result of the three-way merge takes lines 1 and 5 from Z (only Z changed them) and line 3 from F (only F changed it):

LINE 1
Line 2
Line Three
Line 4
Line 5!
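The merge above can be reproduced with a minimal sketch in Python. It assumes the three versions have the same number of lines and compares them position by position, which is a simplification: Git's real merge works on diff hunks, not on fixed line positions. The names merge3 and cherry_pick are hypothetical and exist only for this illustration.

```python
def merge3(base, ours, theirs):
    """Line-by-line three-way merge of equal-length lists of lines.

    Simplifying assumption: the three versions line up position by position.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles versions of equal length")
    merged = []
    for b, o, t in zip(base, ours, theirs):
        if o == t:        # both sides agree (or neither changed the line)
            merged.append(o)
        elif o == b:      # only "theirs" changed the line
            merged.append(t)
        elif t == b:      # only "ours" changed the line
            merged.append(o)
        else:             # both sides changed the line differently
            raise ValueError(f"conflict: {o!r} vs {t!r}")
    return merged


def cherry_pick(parent_of_picked, picked, target):
    # The picked commit's parent is the common ancestor; the target is
    # "ours" and the picked commit is "theirs".
    return merge3(base=parent_of_picked, ours=target, theirs=picked)


E = ["Line 1", "Line 2", "Line 3", "Line 4", "Line 5"]
F = ["Line 1", "Line 2", "Line Three", "Line 4", "Line 5"]
Z = ["LINE 1", "Line 2", "Line 3", "Line 4", "Line 5!"]

result = cherry_pick(E, F, Z)
print("\n".join(result))
```

Because E is the base, anything F inherited from C, D or E is identical in base and "theirs" and so contributes nothing; only the F-E change (line 3) is carried over to Z.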

revert

Is it correct that D' = G - D?

Yes, roughly speaking. The changes that were unique to D have been removed from G. Like git-cherry-pick, git-revert is implemented using a three-way merge, though this time the commit to revert is treated as the common ancestor, one side is the current commit, and the other side is the parent of the commit to revert.

This means that when a line is identical between the commit to revert and the current commit, the line from the reverted commit's parent is chosen instead.

If D, the commit to revert, is acting as the common ancestor, and its contents are:

Line 1
Line 2
Line THREE
Line 4
Line FIVE

And the contents of C (D's parent) are:

Line 1
Line 2
Line 3
Line 4
Line 5

And G has been changed further, so that its contents are:

Line One
Line 2
Line THREE
Line 4
Line FIVE

Then the results of the three-way merge will be:

Line One
Line 2
Line 3
Line 4
Line 5

This is the result of taking the lines unique to the parent C and the lines unique to the target G.
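The same role swap can be sketched in Python. As a simplifying assumption, the three versions are compared line by line at fixed positions and must have equal length (Git itself merges diff hunks). The names merge_line and revert are hypothetical helpers for this illustration only.

```python
def merge_line(base, ours, theirs):
    """Resolve one line of a three-way merge; None signals a conflict."""
    if ours == theirs:
        return ours
    if ours == base:      # only "theirs" differs from the base
        return theirs
    if theirs == base:    # only "ours" differs from the base
        return ours
    return None


def revert(reverted, parent, current):
    # The commit being reverted is the common ancestor; the current commit
    # is "ours" and the reverted commit's parent is "theirs".
    # Assumes all three lists have the same length.
    return [merge_line(b, o, t) for b, o, t in zip(reverted, current, parent)]


D = ["Line 1", "Line 2", "Line THREE", "Line 4", "Line FIVE"]
C = ["Line 1", "Line 2", "Line 3", "Line 4", "Line 5"]
G = ["Line One", "Line 2", "Line THREE", "Line 4", "Line FIVE"]

result = revert(D, C, G)
print("\n".join(result))
```

Lines 3 and 5 are unchanged between D and G, so the merge takes C's version of them, while G's own later change to line 1 survives.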

Merge Commits

As torek notes (below), since both mechanisms rely on a parent commit, they break down when there is more than one parent commit (i.e., the commit in question is a merge and has multiple parents). In this case, you will need to tell git which parent to consider (using the -m flag).
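As a rough illustration of what the -m flag decides, the sketch below picks the parent that will act as the common ancestor. The function merge_base_for_pick and the commit names are hypothetical; the only behaviour taken from git is that the parent number given to -m is 1-based and that a merge commit cannot be picked or reverted without it.

```python
def merge_base_for_pick(parents, mainline=None):
    """Choose which parent acts as the common ancestor.

    `parents` is the ordered list of parent commit names; `mainline`
    mirrors the 1-based parent number passed to -m.
    """
    if not parents:
        raise ValueError("this sketch does not handle root commits")
    if mainline is None:
        if len(parents) > 1:
            raise ValueError("commit is a merge but no mainline (-m) was given")
        return parents[0]
    if not 1 <= mainline <= len(parents):
        raise ValueError(f"commit does not have parent {mainline}")
    return parents[mainline - 1]


# An ordinary commit has a single, unambiguous parent.
print(merge_base_for_pick(["E"]))
# For a merge commit, the mainline number selects the parent to diff against.
print(merge_base_for_pick(["P1", "P2"], mainline=2))
```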

Conflicts

Of course, either of these mechanisms may cause conflicts: if the current commit has further changed the same lines, you will have to resolve them. For example, if in the revert example (above) a subsequent commit had also changed line 5, so that G had actually been:

Line One
Line 2
Line THREE
Line 4
LINE FIVE!

Then there would be a conflict. The working directory (merged file) would be:

Line One
Line 2
Line 3
Line 4
<<<<<<<
LINE FIVE!
=======
Line 5
>>>>>>>

And you will need to decide whether you want the original change (Line 5) or the newest change (LINE FIVE!).
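A minimal Python sketch of how such a conflict arises, under the simplifying assumption that the three versions are compared line by line at fixed positions (Git works on hunks and labels its markers with the names of the two sides). The function merge3_with_markers is hypothetical.

```python
def merge3_with_markers(base, ours, theirs):
    """Line-by-line three-way merge that emits conflict markers.

    Returns (merged_lines, conflicted). Assumes equal-length inputs.
    """
    merged = []
    conflicted = False
    for b, o, t in zip(base, ours, theirs):
        if o == t or t == b:   # sides agree, or only "ours" changed the line
            merged.append(o)
        elif o == b:           # only "theirs" changed the line
            merged.append(t)
        else:                  # both changed it differently: conflict
            conflicted = True
            merged += ["<<<<<<<", o, "=======", t, ">>>>>>>"]
    return merged, conflicted


D = ["Line 1", "Line 2", "Line THREE", "Line 4", "Line FIVE"]   # commit to revert (base)
C = ["Line 1", "Line 2", "Line 3", "Line 4", "Line 5"]          # D's parent (theirs)
G = ["Line One", "Line 2", "Line THREE", "Line 4", "LINE FIVE!"]  # current commit (ours)

merged, conflicted = merge3_with_markers(D, G, C)
print("\n".join(merged))
```

Line 5 differs in all three versions, so neither side can be chosen automatically and both candidates are written out between the markers.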