Reputation: 3195
A colleague of mine pushed a branch (BranchA) to the repo.
I have then created a copy of this branch (testBranch) from BranchA.
Everything is all well and good.
The colleague then pushed up two further commits to BranchA.
I then ran git pull (to get the latest changes from the repo).
However, I do not see the two newly committed files.
Repo
BranchA
Local
git checkout master
git pull
git checkout testBranch origin/BranchA
git merge master
I am not sure why I do not see the latest commits (files).
Workaround:
I feel as though I am missing a step here. It feels odd that I would have to delete the branch each time I need to get the latest changes from a particular origin/branch.
Upvotes: 0
Views: 472
Reputation: 488193
You're ascribing too much magic to branches. :-)
The way Git works is really remarkably simple. A branch name is simply a name for a single Git commit hash ID. (I also advise that you forget that git pull even exists, but we'll see what it is, soon, and how to use it.)
Let's talk about these commit hash IDs a bit. A hash ID is a big ugly string of letters and digits, such as 0d0ac3826a3bbb9247e39e12623bbcfdd722f24c. This uniquely identifies some Git object (typically a commit), and when we work with branch names, it's always, definitely, a commit. Each commit records the hash ID of its parent, or predecessor commit. This allows Git to string commits together into a backwards-looking chain.
What this means is that we can draw these commit chains. If we let a single uppercase letter stand in for the big ugly hash ID, we get something that looks like this:
... <-F <-G <-H <--master
The name master holds the actual hash ID of commit H. That lets Git find H, in the sea of commits floating inside the repository. From H, Git can get the hash ID of G, which is H's parent. So now Git can find G. Using G, Git can find F, and so on, backwards, down the line. The arrows here can be read as points to: master points to H, H points to G, and so on.
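If it helps to see the chain as code, here is a minimal sketch in Python. It is a toy model, not how Git is implemented: the dictionaries `commits` and `branches` and the function `history` are made up for illustration, single letters stand in for hash IDs, and F is treated as the very first commit just to keep the example short.

```python
# Toy model only: real Git stores commits as objects keyed by hash ID.
# Single letters stand in for hash IDs, as in the drawing.

# Each commit records the ID of its parent (None marks the first commit;
# treating F as the first commit is an assumption made for brevity).
commits = {
    "F": {"parent": None},
    "G": {"parent": "F"},
    "H": {"parent": "G"},
}

# A branch name is just a name that holds one commit ID.
branches = {"master": "H"}


def history(branch_name):
    """Follow parent links backwards from the commit the name points to."""
    chain = []
    commit_id = branches[branch_name]
    while commit_id is not None:
        chain.append(commit_id)
        commit_id = commits[commit_id]["parent"]
    return chain


print(history("master"))  # ['H', 'G', 'F']
```

The name only stores H; everything older is found by walking the parent links.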
The contents of each commit are completely, totally, 100% frozen / read-only. Nothing inside any commit can ever change. So we don't really need to draw the internal arrows. However, branch names do change. The way Git adds a new commit to master is to write out a commit object, storing H's hash ID in the new object, along with the new commit snapshot and any other metadata like your name and email address and log message. This produces a new hash, which we'll call I rather than trying to guess it:
...--F--G--H--I
and now Git simply needs to write the hash ID of I into the name master, so that master now points to I:
...--F--G--H--I <-- master
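Here is a rough Python sketch of that two-step process: write a commit that records its parent, then move the name. The function `make_commit` and the text layout being hashed are assumptions for illustration only. A real Git commit object also holds a tree ID, author, committer and timestamps, and is hashed with a header in front.

```python
import hashlib

commits = {}    # hash ID -> commit data (never modified once stored)
branches = {}   # branch name -> hash ID


def make_commit(branch, snapshot, message):
    """Write a new commit whose parent is the branch tip, then move the name."""
    parent = branches.get(branch)  # None for the very first commit
    # Hypothetical layout: the parent's ID is part of what gets hashed,
    # so the new ID depends on the whole history behind it.
    body = f"parent {parent}\nsnapshot {snapshot}\n\n{message}\n"
    commit_id = hashlib.sha1(body.encode()).hexdigest()
    commits[commit_id] = {"parent": parent, "snapshot": snapshot, "message": message}
    branches[branch] = commit_id   # the branch name is the only thing that moves
    return commit_id


h = make_commit("master", "files as of H", "commit H")
i = make_commit("master", "files as of I", "commit I")
print(branches["master"] == i)    # True
print(commits[i]["parent"] == h)  # True
```

Nothing about H changed when I was made. Only the name master was rewritten.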
If you have more than one branch, or if you have multiple remote-tracking names like origin/master and origin/BranchA, we just draw them all:
...--F--G--H <-- master, origin/master
            \
             I--J <-- origin/BranchA
(We'll talk more about remote-tracking names in a moment. They are kind of like branch names, but with a twist.)
When you create a new branch name, all Git has to do is make the new name point to some existing commit. For instance, let's create our own BranchA now, using git checkout BranchA:[1]
...--F--G--H <-- master, origin/master
            \
             I--J <-- BranchA, origin/BranchA
Now let's create testBranch as well, also pointing to commit J:
...--F--G--H <-- master, origin/master
            \
             I--J <-- testBranch, BranchA, origin/BranchA
If you create a new commit now, your Git needs to know which branch name to update. So your Git has this special name, HEAD, written in all-capitals like this.[2] Git attaches this name to one of your branch names:
...--F--G--H <-- master, origin/master
            \
             I--J <-- testBranch (HEAD), BranchA, origin/BranchA
which means that testBranch is the current branch, and is therefore the name that Git will update when you run git commit to make a new commit. One of the things git checkout does is to manage this HEAD-attachment.
[1] Since you don't have a BranchA, you might think: How can I check it out? In fact, you should think that: it's a really good question. The answer is that your Git will create your own BranchA from the remote-tracking name. That's why you had to git checkout -b testBranch but not git checkout -b BranchA: the -b flag says create, and without it, Git will only create if the name doesn't exist and there's a remote-tracking name that does exist that looks right. There's more to it than this, but that's a good start.
[2] Due to a quirk, you can usually use lowercase head on Windows and macOS, but not on Unix-like systems like Linux. It's advisable to avoid this habit, since it won't work on Linux: if you don't like typing HEAD in all caps, use @, which is a synonym for the magic name.
The thing about these branch names is that they're specific to your Git repository. Your master is your master. Your BranchA is your BranchA and your testBranch is yours, too. They won't change unless you change them.
In fact, even your remote-tracking names, origin/master and origin/BranchA, are yours too, but what makes them remote-tracking names is that your Git will automatically change them, to remember what your Git sees in some other Git, whenever your Git calls up their Git and asks them about their branch names. That is, your Git has the URL for some other Git repository, listed under the remote name origin: origin is a short name for some long, maybe-hard-to-type URL. You can run:
git fetch origin
and your Git will call up their Git, at the URL listed under origin, and ask their Git about their branches. They'll say: Oh, sure, here you go: my master is <hash1> and my BranchA is <hash2>. (To see this, run git ls-remote origin, which is like git fetch origin except that after getting the listing of remote names and hashes, it just prints them out.)
With this list in hand, your Git goes on to ask their Git for any new commits they have that you don't. So if they've updated their BranchA, you get their new commits. Then, regardless of what else has happened, your Git now updates all of your remote-tracking names that start with origin/ to match what their Git reported. That is, suppose their BranchA had two new commits. Your own repository now looks like this:
...--F--G--H <-- master, origin/master
            \
             I--J <-- testBranch (HEAD), BranchA
                 \
                  K--L <-- origin/BranchA
Your own BranchA and testBranch have not moved. These are your branches, so they only move when you move them. Your origin/master hasn't moved because their master hasn't moved, but your origin/BranchA has moved, to remember new commit L that you just got from them, because their BranchA did move, and now points to this same commit L.
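To make the "fetch moves only the origin/* names" point concrete, here is a small Python sketch. It is a simplified model under my own assumptions, not Git's actual fetch logic: repositories are plain dictionaries, `fetch` is a made-up function, and commits map straight to their parent's letter.

```python
def fetch(local, remote, remote_name="origin"):
    """Copy missing commits, then update only the remote-tracking names."""
    for commit_id, parent in remote["commits"].items():
        local["commits"].setdefault(commit_id, parent)
    for branch, tip in remote["branches"].items():
        local["remote_tracking"][f"{remote_name}/{branch}"] = tip
    # Note: local["branches"] is deliberately never touched here.


# Their repository after the colleague pushed K and L.
theirs = {
    "commits": {"F": None, "G": "F", "H": "G", "I": "H", "J": "I", "K": "J", "L": "K"},
    "branches": {"master": "H", "BranchA": "L"},
}

# Your repository before fetching.
yours = {
    "commits": {"F": None, "G": "F", "H": "G", "I": "H", "J": "I"},
    "branches": {"master": "H", "BranchA": "J", "testBranch": "J"},
    "remote_tracking": {"origin/master": "H", "origin/BranchA": "J"},
}

fetch(yours, theirs)
print(yours["remote_tracking"])
print(yours["branches"])
```

After the call, origin/BranchA points to L while your own BranchA and testBranch still point to J, which matches the drawing.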
(Remember, our uppercase letters stand in for actual big ugly unique hash IDs. If they made new commits, and you've made new commits, Git guarantees that their new hash IDs are different from every new commit hash you've made! You can see that with an active repository, single uppercase letters would run out way too fast, and be too hard to make unique. But they're a lot easier to draw and make it easier for us to talk about the commits, so that's why I use them here.)
Now that they've updated their BranchA, you might want to have your own BranchA move too. This is where things can start to get complicated, but let's look at an easy way to do that.
We'll start by running git checkout BranchA again. This will attach HEAD to BranchA, so that Git commands that use the current branch are using BranchA. Then we'll use git merge, which in this case, doesn't actually do any merging!
git checkout BranchA
git merge origin/BranchA
Before the git merge, we have this in our repository:
...--F--G--H <-- master, origin/master
            \
             I--J <-- testBranch, BranchA (HEAD)
                 \
                  K--L <-- origin/BranchA
The git merge looks at origin/BranchA and finds that it's pointing to L. It looks at our current branch, the one HEAD is attached to, and finds that it's pointing to J. It realizes that, by starting at L and working backwards, it can get straight to J. This means that the branch name BranchA can be "slid forwards", as it were, against the direction of the internal, backwards-pointing arrows. Git calls this operation a fast-forward. In the context of git merge, it's more like a git checkout that moves the current branch name. That is, commit L becomes the current commit, but it does so by moving the name BranchA. The result is:
...--F--G--H <-- master, origin/master
            \
             I--J <-- testBranch
                 \
                  K--L <-- BranchA (HEAD), origin/BranchA
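The fast-forward test is really just "can I walk backwards from L and reach J?". Here is a hedged Python sketch of that idea. The names `parents`, `names`, `is_ancestor` and `merge_fast_forward` are invented for this illustration, and the sketch ignores the index and work-tree entirely.

```python
# Commit -> parent, and name -> commit, matching the drawing before the merge.
parents = {"F": None, "G": "F", "H": "G", "I": "H", "J": "I", "K": "J", "L": "K"}
names = {"master": "H", "testBranch": "J", "BranchA": "J", "origin/BranchA": "L"}


def is_ancestor(older, newer):
    """True if walking backwards from `newer` reaches `older`."""
    commit_id = newer
    while commit_id is not None:
        if commit_id == older:
            return True
        commit_id = parents[commit_id]
    return False


def merge_fast_forward(current_branch, other):
    """Slide `current_branch` forward to `other` when that is possible."""
    current, target = names[current_branch], names[other]
    if is_ancestor(target, current):
        return "already up to date"
    if not is_ancestor(current, target):
        # Toy model stops here; real Git would go on to do a true merge.
        raise ValueError("not a fast-forward")
    names[current_branch] = target
    return "fast-forward"


print(merge_fast_forward("BranchA", "origin/BranchA"))  # fast-forward
print(names["BranchA"], names["testBranch"])            # L J
```

Only the name BranchA moved. testBranch stays at J until you do the same thing for it.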
You now have commit L as your current commit, and commit L is filling in the index and the work-tree. It's time to talk a little bit about these two.
We already mentioned that files stored inside commits are completely, totally, 100% frozen / read-only. They're stored in a special, compressed, Git-only format. This lets Git save a lot of space, and re-use unchanged files: if a new commit has mostly the same files as the previous commit, there's no need to save all the files. The old commit's copies are frozen, so the new commit can just share them. (The details by which this process works don't really matter here, but Git uses hash IDs, with what Git calls blob objects, to achieve this trick.)
That's great for Git, but we can't use frozen compressed Git-only files to do anything else. So Git has to thaw out and de-compress the frozen files, into their normal everyday form, so that we and the rest of the programs on our computer can use them.
The thawed-out files go into the work-tree, which is called that because that's where we work on them. Here, we can do anything we want with our files. So, for each file, there's a frozen copy in the current commit, and a thawed copy in the work-tree. (There may be frozen copies in other commits too, but the one in the current commit is the most interesting, since we can and will often compare it to the one in the work-tree.)
The index, which is also called the staging area or sometimes the cache, is a peculiar thing, unique to Git. Other version control systems also have frozen commits and thawed work-trees, but either don't have an index, or keep anything index-like totally hidden so that you don't need to know about it. Git, on the other hand, will, now and then, whack you in the face with the index. You must know about it, even if you don't use it for fancy tricks.
What the index holds is, essentially, a copy of each file. That is, each file in the current commit is also in the index. The index copy is in the special Git-only format. Unlike the frozen commit copy, though, this one is only semi-frozen, kind of slushy, if you will. You can replace it any time with a new, different, Git-ified and semi-frozen copy. That's what git add does: it Git-ifies the work-tree copy of the file, compressing it into the Git-only format and replacing the previous index copy. (If the new one matches any old one, in any frozen Git commit, it winds up re-using that old one: saving space! Otherwise it's a new Git-ized copy.)
Making a new commit, in Git, just needs to flash-freeze these index copies. They're all already ready for that, which is a significant part of why git commit is so much faster than committing in other version control systems. But it also means that the index can be described as what will go into your next commit. Git builds new commits from the index, not from the work-tree.
You need the work-tree to work on your files. Git needs, and uses, the index to make new commits. The index and work-tree copies can differ; it's part of your job to git add the work-tree copies, to overwrite the index copies with updated ones, before committing.
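A tiny Python sketch of the three copies may help. This is a deliberately crude model, assuming one file and plain dictionaries: the functions `add` and `commit` only mimic the roles of git add and git commit, with no compression and no hash IDs. Real Git would also refuse to make the first commit below, since nothing was staged.

```python
# The three copies of each file, modeled as plain dictionaries.
head_commit = {"README": "v1"}   # frozen: never modified below
index = dict(head_commit)        # starts out matching the current commit
work_tree = dict(head_commit)    # ordinary editable files

work_tree["README"] = "v2"       # edit the file; the index still holds v1


def add(path):
    """Like git add: copy the work-tree version into the index."""
    index[path] = work_tree[path]


def commit():
    """Like git commit: freeze a snapshot of the index, not the work-tree."""
    return dict(index)


without_add = commit()   # snapshot taken before staging the edit
add("README")
with_add = commit()      # snapshot taken after staging the edit
print(without_add["README"], with_add["README"])  # v1 v2
```

The edit only reaches a commit once it has been copied into the index.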
testBranch
With all that out of the way, let's look now at updating your testBranch. Remember, we ran git fetch to update all our origin/* names, then git checkout BranchA and git merge origin/BranchA to update BranchA, so that we now have this:
...--F--G--H <-- master, origin/master
            \
             I--J <-- testBranch
                 \
                  K--L <-- BranchA (HEAD), origin/BranchA
We now need to git checkout testBranch to attach HEAD to it. Then we can run git merge BranchA or git merge origin/BranchA:
git checkout testBranch
git merge <anything that identifies commit L>
The idea here is to make Git look at commit L. The merge command will then see whether or not it's possible to do the same fast-forward operation it did for BranchA. The answer will be yes: it's definitely possible to go from commit J straight to commit L. So by default, Git will do just that, and you will get this:
...--F--G--H <-- master, origin/master
            \
             I--J
                 \
                  K--L <-- testBranch, BranchA, origin/BranchA
Note that we can do this even if we never create our own BranchA, because instead of git merge BranchA we can run git merge origin/BranchA. That is, if we have:
...--F--G--H <-- master, origin/master
            \
             I--J <-- testBranch (HEAD)
                 \
                  K--L <-- origin/BranchA
and run git merge origin/BranchA, Git will do the exact same fast-forward that it would have done with the version with a name BranchA pointing to commit L. What matters here are not the branch names, but rather the commits. Well, our own branch names, like testBranch, matter, in that we need to make them point where they should; but the other names, the remote-tracking names, we only use to find the commits. They're just more readable than hash IDs, and our Git will automatically update them on git fetch.
Hence, suppose we never created BranchA in the first place. Suppose instead we did:
$ git clone <url>
$ cd <repository>
$ git checkout -b testBranch origin/BranchA
... wait until colleague updates origin/BranchA ...
$ git fetch # defaults to using origin
$ git merge origin/BranchA
then we'd be done, without having to fiddle with our BranchA that we never even created.
I'm going to omit what happens, here, if you make your own commits. In this case, you get a true merge: git merge will see that it's not possible to just fast-forward, and will run the process of merging, and then make a commit of type merge commit. Instead, let's just address the last bit of the puzzle, git pull.
git pull (don't use it!)
My advice for git pull is that as a beginner, you should studiously avoid it. However, other people and documentation will tell you to use it, so you should at least know what it does. All that git pull is and does is to run two Git commands for you. It's meant to be convenient. The problem is, sometimes it is convenient, and sometimes it's remarkably not-convenient. It's much better, in my opinion, to learn to use the two underlying Git commands first.
The first Git command that git pull runs is just git fetch. We already saw what that does: it calls up some other Git, gets a list from it of its branch names (and tag names) and hash IDs, and brings into your repository whatever commits you need, so that your Git can update all your remote-tracking names. Then it's done: nothing has happened to your index and work-tree. It's safe to run git fetch at any time, because it just adds new commits and updates remote-tracking names.
The second command that git pull runs is where the trouble comes in. You can choose which second command it runs. Normally, that's git merge, which does what we saw above. But you can make it run git rebase, which we have not covered here.
In either case, git pull passes some extra arguments to the git merge or git rebase command. These extra arguments cause some of the inconvenience, because they are different from the arguments you might want to use. In particular, if you run:
git pull origin master
this has the effect of running:
git fetch origin master
git merge -m "merge branch 'master' of $url" origin/master
Note the slash here in the last argument: Git is going to merge the commit now identified by your origin/master. The -m (message) contains the URL taken from origin, plus the name master, rather than the name origin/master, but the effect of the merge, whether fast-forward or real merge, is the same as merging your updated remote-tracking name, origin/master.[3]
If you use separate git fetch and git merge commands, they make more sense. When you use git pull, the branch name you list, if you list one, is the name on the other Git, rather than the remote-tracking name in your Git.
The same holds even if you have git pull run git rebase for you. And, in the last twist of being not-convenient, the decision of whether to use merge or rebase is one you sometimes should make after running git fetch. That is, you should look at what git fetch fetches, to decide which second command to run. But if you use git pull, you must make this decision before you run git fetch, so you can't look.
Once you have used Git for a while, and are very familiar with both git merge and git rebase, then you can start using git pull safely. (But I still mostly don't.)
[3] There's another wrinkle here, with fairly old versions of Git: before Git version 1.8.4, git pull didn't update the remote-tracking name. Modern Git does away with this weird quirk, but some systems still use really old Git versions, so it's important to know about.
Upvotes: 2
Reputation: 3987
You made a new branch, testBranch, from BranchA, and your colleague then pushed changes to BranchA. But you're still on testBranch. So there are no remote changes for that branch to pull, which explains why the new commits in BranchA aren't seen in testBranch:
...--o--o--* <-- BranchA
You created a copy of branch BranchA.
git checkout -b testBranch
...--o--o--* <-- BranchA, testBranch
New commits in BranchA
             A--B--C <-- BranchA
            /
...--o--o--* <-- testBranch
Upvotes: 0