According to my understanding of merge conflicts, a merge conflict occurs when two people have changed the same file, and/or modified the same line in that file. So when I did a
git pull origin master
I expected a merge conflict, since the same line was different in both the versions, but it looks like git decided to overwrite my local files.
To give more information: I pushed my version to GitHub a few days back. Then someone pulled it, worked with it, and pushed it back to GitHub. Two of the files the other guy modified are of interest to me.
The first file is a config file, and the other guy changed the password in that. So when I pulled from github, the password in my local version was different from that on github. BUT, in my terminal, it said
Auto-merging <filename>
And it overwrote my file, so the password is now the one set by the other guy.
The second file of interest is an HTML file written in a template engine (Pug). The other guy changed a lot of things in that file: he added a lot of CSS classes, removed some classes I had used, added links to CSS files, and so on. BUT when I pulled it, the terminal did not even mention that it was auto-merging it; it just overwrote the entire file in my local repo with the one from GitHub.
For both of the files, my question is, whether this is the expected behaviour from using git pull, or did I do anything wrong?
Below are the commands I used.
git checkout -b "misc"
git pull origin master
Also, I tried to just use fetch and then manually merge/commit it, but when I used fetch, nothing happened. The files did not change at all.
I have used git/github before, but never really worked extensively in a team using branches and pushing/pulling from github.
Git is behaving correctly. This is the expected (though not really "desired", in your case) result.
There's a bit at the bottom about how to work with Git to make it actually useful for you.
Besides Mykhailo Kovalskyi's answer, there's a more likely scenario. You did this:
git checkout -b "misc"
git pull origin master
The first line is straightforward enough. It's the second that's extra-complicated, because git pull is git fetch followed by git merge, and both of those are a little bit complicated themselves.
Whenever you are working with branches in Git (and you're always working with branches, so this is really just "whenever you're working with Git"), it's important to keep the commit graph in mind. The graph, or DAG (Directed Acyclic Graph), is always there, usually lurking just out of sight. To see it with git log, use --graph, often with --oneline. To see it with visualizers, use something like gitk or one of the many annoying GUIs, which give you views like those shown here (this is just a randomly-chosen question on stackoverflow about what was seen in gitk vs git-gui).
The graph determines how merges will work, so it's very important at that time. At other times, it mostly just lurks, out of the way but ever-present. Almost everything in Git is oriented around adding commits, which adds entries to this graph.1
So, let's draw a bit of a graph, and then observe git fetch and git merge in action.
Here's a graph of a repository with nothing but a master branch, with four commits on it:
o--o--o--o <-- master
The master branch "points to" the tip-most commit. In this graph, with newer commits at the right, that's the right-most commit.
Each commit also points backwards, to its parent commit. That is, the lines in o--o--o really should be arrows: o <- o <- o. But these arrows all point backwards, which is annoying and mostly useless to humans, so it's nicer to just draw them as lines. The thing is that these backwards arrows are how Git finds earlier commits, because branch names only point to the tip-most commit!
Git also has the name HEAD, which is a symbol for the "current commit". The way HEAD normally works is that it actually contains the branch name, and the branch name then points to the tip commit. We can draw this with a separate arrow:
               HEAD
                 |
                 v
o--o--o--o <-- master
but that takes too much room, so I usually use this:
o--o--o--o <-- master (HEAD)
Git will discover that HEAD is "attached to" (contains the name) master, then follow the backwards arrow from master to the tip commit.
Hint: use git log --decorate to show branch names and HEAD. It's particularly good with --oneline --graph: think of this as a friendly dog: Decorate, Oneline, Graph. In Git 2.13 and later, --decorate happens automatically when the output goes to a terminal, so you don't have to turn it on yourself most of the time. See also this answer to Pretty git branch graphs.
(Note that git log --decorate prints the decoration as HEAD -> master when HEAD points to master. When HEAD points directly to a commit, Git calls this a detached HEAD, and you might see HEAD, master instead. This formatting trick was new in Git 2.4: before that, it just showed HEAD, master for both detached HEAD mode and non-detached-HEAD mode, for this case. In any case, I call "non-detached" an attached HEAD, and I think master (HEAD) shows this attachment pretty well.)
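To try the hint on something harmless, here is a minimal sketch that builds a throwaway repository with four commits and prints its graph. The temporary directory, the demo identity, and the commit messages are all made up for illustration; the explicit symbolic-ref step just forces the branch name to master so the output matches the drawings in this answer.

```shell
# Throwaway repository with four empty commits on master (names are illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make the unborn branch "master"
for n in 1 2 3 4; do
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "commit $n"
done

# D.O.G.: Decorate, Oneline, Graph. --all adds every branch and remote-tracking name.
graph=$(git log --decorate --oneline --graph --all)
printf '%s\n' "$graph"
commit_count=$(git rev-list --count HEAD)
```

The top line of the output should carry the HEAD -> master decoration described above, since HEAD is attached to master here.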
Now, the git checkout -b misc step creates a new branch name. By default, this new branch name points to the current (HEAD) commit, so now we have:
o--o--o--o <-- master, misc (HEAD)
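A quick way to convince yourself that the new name adds no commit is to compare hash IDs. This is only a sketch in a scratch repository (the directory, identity, and commit message are assumptions for the demo):

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first"

git checkout -q -b misc                  # new name, same commit, HEAD now attached to misc

master_tip=$(git rev-parse master)
misc_tip=$(git rev-parse misc)
current=$(git symbolic-ref --short HEAD)
echo "master=$master_tip"
echo "misc=$misc_tip"
echo "HEAD is attached to: $current"
```

Both names resolve to the same hash ID; the only thing that moved is which name HEAD is attached to.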
1In fact, you can never change a commit. Things that seem to change a commit really work by adding a new commit that resembles the old one, and then they cover up the old one and show you the new one instead. This makes it look like the commit has changed, but it hasn't. You also can't remove commits, or at least, not directly: all you can do is make them unreachable from branch and tag names and the like. Once a commit is unreachable, Git's maintenance "garbage collector" eventually removes it. Making git gc remove it right now can be difficult. Git tries really hard to let you get your commits back, even if you want them gone.
But, all of this applies only to commits, hence the rule of thumb: "commit early and often". Anything you have actually committed, Git will try to let you retrieve again later, usually for up to 30 or 90 days.
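The "you can never change a commit" claim is easy to check with git commit --amend, which looks like an edit but is really a replacement. A small sketch, using a scratch repository and a made-up file name and messages:

```shell
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

repo=$(mktemp -d)
cd "$repo"
g init -q
echo "hello" > notes.txt
g add notes.txt
g commit -q -m "add ntoes"               # commit with a typo in its message
old=$(git rev-parse HEAD)

g commit -q --amend -m "add notes"       # "fix" the message
new=$(git rev-parse HEAD)

# The old commit was not modified; it is merely no longer on the branch.
old_type=$(git cat-file -t "$old")
old_subject=$(git log -1 --format=%s "$old")
visible=$(git rev-list --count HEAD)
echo "old=$old ($old_subject), new=$new, commits on branch: $visible"
```

The branch shows a single commit, but the original is still in the repository under its old hash ID until the garbage collector eventually gets to it.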
git fetch
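What git fetch does can be summarized as: call up another Git (here, the one at origin), ask it which commits it has, and collect any commits you lack, plus whatever else is needed to make those commits usable, into your own repository.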
In this way, Git is like The Borg. But instead of: "We are the Borg. We will add your biological and technological distinctiveness to our own," Git says "I am the Git. Your technologically-distinctive commits will be added to my own!"
So, let's see what happens when you git fetch origin. You have this:
o--o--o--o <-- master, misc (HEAD)
They have this, which has several extra commits on their master (and we don't care about their HEAD now):
o--o--o--o--o--o <-- master
Your Git renames their master, calling it origin/master on your own end, so that you can keep them straight. Their two new commits are added to your repository, all Borg-like. Those new commits point back to the existing four commits, with the usual backwards arrows, but now it takes more room to draw the graph:
o--o--o--o <-- master, misc (HEAD)
          \
           o--o <-- origin/master
Note that none of your branches are changed. Only the origin ones change. Your Git adds their technological uniqueness,2 and re-points your origin/master to keep track of "where master was on origin the last time I checked."
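The fetch step can be reproduced with two scratch repositories standing in for "theirs" and "ours". Everything here (directory names, the demo identity, the empty commits) is an assumption made for the sketch; the point is only that fetch moves origin/master and nothing else.

```shell
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }
work=$(mktemp -d)

# "Their" repository: four commits on master.
g init -q "$work/theirs"
git -C "$work/theirs" symbolic-ref HEAD refs/heads/master
for n in 1 2 3 4; do g -C "$work/theirs" commit -q --allow-empty -m "commit $n"; done

# "Our" clone, with misc created at the same commit as master.
g clone -q "$work/theirs" "$work/ours"
g -C "$work/ours" checkout -q -b misc

# They add two more commits after we cloned.
for n in 5 6; do g -C "$work/theirs" commit -q --allow-empty -m "commit $n"; done

misc_before=$(git -C "$work/ours" rev-parse misc)
master_before=$(git -C "$work/ours" rev-parse master)
g -C "$work/ours" fetch -q origin
misc_after=$(git -C "$work/ours" rev-parse misc)
master_after=$(git -C "$work/ours" rev-parse master)
ahead=$(git -C "$work/ours" rev-list --count misc..origin/master)
echo "origin/master is $ahead commits ahead of misc; misc and master did not move"
```

This is also why "nothing happened" after a plain fetch: the new commits arrived, but no local branch or working-tree file was touched.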
2This is where those big ugly SHA-1 IDs come in. The hashes are how Git can tell which commits are unique to which repository. The key is that the same commit always makes the same hash ID, so if their Git has commit 12ab9fc7..., and your Git has commit 12ab9fc7..., your Git already has their commit, and vice versa. The mathematics behind all this is rather deep and beautiful.
git merge
The second half of git pull is to run git merge. It runs the equivalent3 of git merge origin/master. The git merge command starts by finding the merge base, and this is where the graph suddenly really matters.
The merge base between two commits is, loosely speaking, "the point in the graph where the lines all come back together." Usually the two commits are two branch-tips, pointed-to by two branch names. A typical, and nicely obvious, case occurs with this:
           o--o <-- branch1 (HEAD)
          /
o--o--o--*
          \
           o--o--o <-- branch2
What git merge does is to locate the nearest common-ancestor commit, which I've drawn as * instead of just o here. That's the merge base. It's simply the point from which the two branches "fork off".
The goal of git merge is to find out what "you" have changed (what you've done in branch1 since commit *) and what "they" have changed, i.e., what has changed in branch2 since commit *. To get those changes, Git runs two git diff commands.
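You can ask Git for the merge base and run the same two diffs by hand. The following sketch builds a tiny version of the forked graph above (one commit per side instead of two and three; the file names are invented for the demo):

```shell
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

repo=$(mktemp -d)
cd "$repo"
g init -q
git symbolic-ref HEAD refs/heads/branch1
echo base > file.txt;     g add file.txt;   g commit -q -m "base (*)"
g branch branch2                          # both names point at * for now

echo ours > ours.txt;     g add ours.txt;   g commit -q -m "our change"
g checkout -q branch2
echo theirs > theirs.txt; g add theirs.txt; g commit -q -m "their change"
g checkout -q branch1

base=$(git merge-base branch1 branch2)    # the commit drawn as *
ours_changed=$(git diff --name-only "$base" branch1)     # what "we" changed
theirs_changed=$(git diff --name-only "$base" branch2)   # what "they" changed
echo "merge base: $base"
echo "we changed: $ours_changed"
echo "they changed: $theirs_changed"
```

git merge works from the same pair of comparisons: merge base against each tip.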
The same applies even if we draw the commits like this:
o--o--o--*--o--o <-- branch1 (HEAD)
          \
           o--o--o <-- branch2
This is the same graph, so it's the same merge. Git compares commit * against the tip of branch1 ("what's changed in our two commits?"), and commit * against the tip of branch2 ("what's changed in their three commits?"). Then Git does its best to combine those changes, and makes a new merge commit from the result. The exact details of all this combining-and-committing don't matter yet, because we don't have a graph like that.
What we have is this:
o--o--o--* <-- master, misc (HEAD)
          \
           o--o <-- origin/master
Note that I've kept the * notion here. That's because git merge still finds the merge base. The problem here is that the merge base is the branch tip: the name misc points directly to commit *.
If Git were to do git diff <commit-*> <commit-*>, the diff would obviously be empty. Commit * is the same as commit *. So how can we merge these?
Git's answer is: we don't merge at all. We do what Git calls a fast forward. Note that although the internal commit arrows all point backwards, if we just imagine them pointing forwards instead, it's now easy to take the misc branch-label and slide it forward, going down along the dog-leg and then to the right. The result looks like this:
o--o--o--o <-- master
          \
           o--o <-- origin/master, misc (HEAD)
So now our config file is the one in the HEAD commit, which is the tip-most commit of misc, which is the same commit as origin/master.
In other words, we lost our changes to the config file, as they were overridden by their changes to the config file.
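Here is a rough reproduction of that outcome, under the assumption that misc has no commits of its own when the merge runs. The repositories, the config.txt file name, and the password values are all invented for the sketch.

```shell
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }
work=$(mktemp -d)

# Shared history: one commit containing the config file.
g init -q "$work/theirs"
git -C "$work/theirs" symbolic-ref HEAD refs/heads/master
echo "password=ours" > "$work/theirs/config.txt"
g -C "$work/theirs" add config.txt
g -C "$work/theirs" commit -q -m "initial config"

g clone -q "$work/theirs" "$work/ours"
g -C "$work/ours" checkout -q -b misc     # misc points at the merge base itself

# They change the password and commit.
echo "password=theirs" > "$work/theirs/config.txt"
g -C "$work/theirs" commit -q -am "change password"

# What git pull origin master amounts to: fetch, then merge.
g -C "$work/ours" fetch -q origin
g -C "$work/ours" merge origin/master     # Git reports a fast-forward here

config_now=$(cat "$work/ours/config.txt")
echo "config.txt now contains: $config_now"
```

No merge commit is made: misc simply slides forward to the same commit as origin/master, so the working tree gets their version of the file.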
3The details of why it doesn't actually use git merge origin/master are mostly irrelevant here, but have a lot to do with history. In the old days of Git, before version 1.8.4, some forms of git fetch, such as the git fetch origin master that git pull runs, never actually bothered to update origin/master. This was a bad design decision, and in all modern Git versions, git fetch does update it.
If we go back to our original setup (and drop the name master since it's in the way):
o--o--o--* <-- misc (HEAD)
          \
           o--o <-- origin/master
we could, instead of letting git pull run git merge, run our own git merge --no-ff origin/master, to merge origin/master but not allow Git to do a fast-forward. Would this help?
Alas, no. Remember that the goal of a merge is to combine all the changes since the merge-base. So Git will run two diffs:
git diff <commit-*> <commit-*> # this diff is empty
git diff <commit-*> origin/master # this is "what they changed"
Git will then combine our changes (none) with their changes, and make a new merge commit:
o--o--o--o------o <-- misc (HEAD)
          \    /
           o--o <-- origin/master
We have a different graph (it's sort of a soup ladle or Big Dipper), but we took their changes, including the password change, while keeping nothing of ours (we had no changes since the merge base).
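The same scratch setup as before shows that forcing a real merge commit changes the graph but not the file contents. Again, the repository layout, config.txt, and the password values are assumptions for the sketch.

```shell
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }
work=$(mktemp -d)

g init -q "$work/theirs"
git -C "$work/theirs" symbolic-ref HEAD refs/heads/master
echo "password=ours" > "$work/theirs/config.txt"
g -C "$work/theirs" add config.txt
g -C "$work/theirs" commit -q -m "initial config"

g clone -q "$work/theirs" "$work/ours"
g -C "$work/ours" checkout -q -b misc

echo "password=theirs" > "$work/theirs/config.txt"
g -C "$work/theirs" commit -q -am "change password"

g -C "$work/ours" fetch -q origin
# Forbid the fast-forward: Git must create a merge commit with two parents.
g -C "$work/ours" merge -q --no-ff -m "merge origin/master" origin/master

second_parent=$(git -C "$work/ours" rev-parse misc^2)
config_now=$(cat "$work/ours/config.txt")
echo "second parent: $second_parent"
echo "config.txt now contains: $config_now"
```

The merge commit exists, with origin/master as its second parent, yet config.txt still ends up with their password because our side of the diff was empty.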
What we need is to make sure "our" changes—they have to be our changes, in Git's eyes—"look different" from "their" changes. That means we need Git to choose a different merge base.
The merge base is, as I said above, the point at which our commits and their commits begin to diverge. That means we need to make our own branch, and make sure we don't "fast forward" too much, or perhaps even at all.
So, we probably do want to avoid git pull.4 We also may want to pick an earlier point at which we make our own branch. We want our graph's branch to maintain its own distinctiveness, as it were, from theirs. I've given a few of these commits letter-names so that I can talk about them:
     A-----B <-- misc (HEAD)
    /     /
o--o--o--o <-- master
          \
           o--C <-- origin/master
In commit A, we change the config file to have a different password. Then we git merge (not fast-forward) the tip of master to pick up new stuff, without letting the password change. This step may be very manual, or totally automatic, but once it's committed, we're done: commits are permanent; they can't be changed.5
Now we can allow master to "fast forward" as usual:
     A-----B <-- misc (HEAD)
    /     /
o--o--o--*--o--C <-- master, origin/master
Now, when we git merge origin/master or git merge master,6 the merge base will be the commit I've marked *. If we didn't change the password from * to B, and they changed it from * to C, we'll pick up their change, but they should no longer need to change it, because we never send them commits A and B; we keep those to ourselves. So there should be no change to the password from * to C, and we'll keep our changed password when we make our new merge:
     A-----B-----D <-- misc (HEAD)
    /     /     /
o--o--o--o--o--C <-- master, origin/master
Later, we'll pick up even more commits, merge (fast forward) them into master, and be ready to merge again:
     A-----B-----D <-- misc (HEAD)
    /     /     /
o--o--o--o--o--C--o--o <-- master, origin/master
This time, the merge base will be commit C (it's the closest one that is on both misc and their branch), and Git will diff C vs origin/master. Presumably, they still won't have changed the password, because we still didn't give them commit D.
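A simplified version of this workflow can be tried in scratch repositories. The sketch below collapses the picture to a single private commit (the A in the drawings) and a single upstream commit (like C); config.txt, app.txt, and their contents are invented, and it assumes the other side does not touch the password line.

```shell
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }
work=$(mktemp -d)

g init -q "$work/theirs"
git -C "$work/theirs" symbolic-ref HEAD refs/heads/master
echo "password=shared" > "$work/theirs/config.txt"
echo "version 1"       > "$work/theirs/app.txt"
g -C "$work/theirs" add config.txt app.txt
g -C "$work/theirs" commit -q -m "initial"

g clone -q "$work/theirs" "$work/ours"
g -C "$work/ours" checkout -q -b misc

# Commit A: our private password change, never pushed.
echo "password=ours" > "$work/ours/config.txt"
g -C "$work/ours" commit -q -am "A: local password"

# Commit C: they change something else.
echo "version 2" > "$work/theirs/app.txt"
g -C "$work/theirs" commit -q -am "C: update app"

g -C "$work/ours" fetch -q origin
g -C "$work/ours" merge -q --no-edit origin/master   # a real merge: misc has its own commit

config_now=$(cat "$work/ours/config.txt")
app_now=$(cat "$work/ours/app.txt")
echo "config.txt: $config_now"
echo "app.txt:    $app_now"
```

Because the password change is now "ours" relative to the merge base and they made no competing change, the merge keeps our password and still picks up their update. If they did change the same line, this is where Git would stop with a conflict instead of silently choosing a side.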
4I avoid git pull as much as possible, but depending on how you go about this, you may be able to use it anyway, especially for master.
5We make any ordinary new commit by moving the branch label to the new commit: remember that branch names just point to the tip-most commit. We just make a new tip commit, with its parent being the previous tip-most commit, and re-point the label, moving forward one step. But look what happens when we make a new commit that points further back, not just to the old tip commit, for its parent. Now we "rewrite history" by hiding some previous commits. (Try drawing this graph.) This is how both git commit --amend and git rebase work.
6Note that these do the same thing, as the tip of master and the tip of origin/master are the same commit. The one difference is that the default commit message will change: one will say "merge master" and the other will say "merge origin/master". (There's some fiddly stuff in Git's commit message formatting that treats master differently from everything else, too, but we can ignore that. It's just a historical artifact.)
Because commits are so permanent, it's generally a very bad idea to put passwords into them. Anyone with access to your repository can look through historical commits and find the passwords.
Configuration files, too, generally shouldn't be committed at all, though here there's no real security issue. Instead, it's a matter of the very problem you have run into: everyone needs a different configuration. Committing yours to a shared repository makes no sense. If it's a private repository, that makes somewhat more sense, and if it's a private branch it's OK (if still sub-optimal in most cases).
It's pretty common to want some sort of sample configuration, or a default initial configuration. These should indeed be in commits. The trick is to make sure that the sample, or default initial, configuration is separate from the "live" configuration. For instance, with some systems, you'd include:
config.default
and have a little bit of code, such as:
[ -f .config ] || cp config.default .config
to set up the default configuration as the .config file on the first run. Then with .config in .gitignore, it won't ever get put into the repository, so it will never be in any commits and you won't have this issue in the first place.
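Put together, the arrangement might look like the following sketch. The file contents are placeholders, and the repository is a scratch one; only config.default, .config, and .gitignore come from the description above.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "password=changeme" > config.default   # sample configuration, meant to be committed
echo ".config" > .gitignore                 # the live configuration is never tracked

# First-run setup: copy the default only if no live config exists yet.
[ -f .config ] || cp config.default .config
echo "password=my-local-secret" > .config   # local edit that stays out of Git

git add -A                                  # stage "everything"
staged=$(git diff --cached --name-only)
ignored=$(git check-ignore .config)
echo "staged files:"
printf '%s\n' "$staged"
echo "ignored: $ignored"
```

Even with git add -A, only .gitignore and config.default are staged, so each person's .config (and the password in it) never reaches a commit.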