Assuming I am the maintainer of a repo, and I want to pull in changes from a contributor, there are a few possible workflows:
cherry-pick
each commit from the remote (in order). In this case git records the commit as unrelated to the remote branch.merge
the branch, pulling in all changes, and adding a new "conflict" commit (if needed).merge
each commit from the remote branch individually (again in order), allowing conflicts to be recorded for each commit, instead of grouped all together as one. rebase
(same as cherry-pick
option?), however my understanding is that this can cause confusion for the contributor. Maybe that eliminates option 1.In both cases 2 and 3, git records the branch history of the commits, unlike 1.
What are the pro's and con's between using either cherry-pick
or merge
methods described? My understanding is that method 2 is the norm, but I feel that resolving a large commit with a single "conflict" merge, is not the cleanest solution.
Both rebase
(and cherry-pick
) and merge
have their advantages and disadvantages. I argue for merge
here, but it's worth understanding both. (Look here for an alternate, well-argued answer enumerating cases where rebase
is preferred.)
merge
is preferred over cherry-pick
and rebase
for a couple of reasons.
merge
workflow fairly easily. rebase
tends to be considered more advanced. It's best to understand both, but people who do not want to be experts in version control (which in my experience has included many colleagues who are damn good at what they do, but don't want to spend the extra time) have an easier time just merging.Even with a merge-heavy workflow rebase
and cherry-pick
are still useful for particular cases:
merge
is cluttered history. rebase
prevents a long series of commits from being scattered about in your history, as they would be if you periodically merged in others' changes. That is in fact its main purpose as I use it. What you want to be very careful of, is never to rebase
code that you have shared with other repositories. Once a commit is push
ed someone else might have committed on top of it, and rebasing will at best cause the kind of duplication discussed above. At worst you can end up with a very confused repository and subtle errors it will take you a long time to ferret out.cherry-pick
is useful for sampling out a small subset of changes from a topic branch you've basically decided to discard, but realized there are a couple of useful pieces on.As for preferring merging many changes over one: it's just a lot simpler. It can get very tedious to do merges of individual changesets once you start having a lot of them. The merge resolution in git (and in Mercurial, and in Bazaar) is very very good. You won't run into major problems merging even long branches most of the time. I generally merge everything all at once and only if I get a large number of conflicts do I back up and re-run the merge piecemeal. Even then I do it in large chunks. As a very real example I had a colleague who had 3 months worth of changes to merge, and got some 9000 conflicts in 250000 line code-base. What we did to fix is do the merge one month's worth at a time: conflicts do not build up linearly, and doing it in pieces results in far fewer than 9000 conflicts. It was still a lot of work, but not as much as trying to do it one commit at a time.