Reputation: 11

Fetch from another branch? (Basic question, should be easy but I can't get it right...)

If main splits to two branches "A" and "B" and I have a branch off of "A" that I want to merge/cherry-pick from a branch off of "B" how would I do that? I've tried a bunch of 'fetch'/checkout commands but nothing seems to be working.

I tried variations of the solutions from this Stack Overflow question, but I'm having some trouble getting it right. Any suggestions?

Upvotes: 1

Answers (1)

torek

Reputation: 489828

Your linked question dates back to 2012; the accepted and top answer there mentions Git 1.5.6.5. Current Git is 2.34 or 2.35 (depending on whether 2.35 is out for your system yet) and as far as I can tell, even the oldest systems come with Git 1.7 now. So some of the information in some of the answers there is outdated. Here's the more-modern (1.7 or later) situation:

If you made a single-branch clone using --depth or --single-branch, de-single-branch-ize your clone first. See How do I "undo" a --single-branch clone?
Now just run git fetch or git fetch origin (use the correct name for your remote). Wait for it to finish.

Now run git branch -r. You should see origin/ names for each branch in the Git repository over at origin. Use:

git log origin/B

to see the hash IDs of commits that are on your Git's memory of their Git's branch named B; use those commit hash IDs with git cherry-pick, if your goal is to use git cherry-pick.
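Here is a minimal end-to-end sketch of those steps, run against a throwaway repository so that nothing real is touched. The directory names, the branch name my-topic, the file names, and the "Example" identity are all made up for illustration. For brevity it picks the tip commit of origin/B itself; a branch made off of B over at origin would show up as its own origin/ name and work the same way.

```shell
set -e
# Throwaway identity so that commits work without any global configuration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
work=$(mktemp -d)

# Stand-in for the repository "over at origin": A and B split off from master,
# and B then gains one commit that we will want later.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name
echo base > base.txt
git add base.txt
git commit -q -m "base"
git branch A
git checkout -q -b B
echo fix > fix.txt
git add fix.txt
git commit -q -m "fix on B"

# Our clone: fetch, list the origin/ names, then make a branch from origin/A.
git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"
git fetch -q origin
git branch -r
git checkout -q -b my-topic origin/A

# Find the commit we want via origin/B and copy it onto my-topic.
git log --oneline origin/B
wanted=$(git rev-parse origin/B)
git cherry-pick "$wanted"
```

After this, my-topic has its own copy of the "fix on B" commit, and fix.txt is present in the working tree.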

Optional reading: what to know about all of this

Git is, at its heart, all about commits. It's not about files, though commits contain files. It's not about branches, though branch names help Git (and thus us) find commits. It's really about the commits. So you need to know exactly what a commit is and does for you, first, and then how you find them.

Commits

Each commit is:

Numbered. The unique number for a commit is its hash ID. This number is unique to this commit—not just this commit in this repository, but to this commit in every repository. Every Git repository that has this hash ID, has this commit in it; every Git repository that lacks this hash ID, lacks this commit. That is, the hash ID is a Globally Unique ID (GUID) or Universally Unique ID (UUID). This is how two Git repositories, when they get together to exchange commits, can tell which one has which commits. They don't look at the contents of the commits at this point, just the IDs: those are unique, so from the IDs alone, they can tell.
Read-only: no part of any commit can ever be changed. (This is necessary for the hash IDs to work, since Git itself is distributed. If you could change a commit, two Gits could get together, exchange a commit and share its hash ID, and then one Git could change the commit so that it doesn't match the other Git repository's copy, and that's not allowed.)
A container with two parts:
- There is a snapshot of all files. The files are stored in a compressed and de-duplicated fashion. Only Git can even read these files, and nothing at all can write them, once they've been written out (because they have the same kind of UUID hash IDs).
- There is some metadata, or information about the commit itself: who made it, when, and why (the log message), for instance. The metadata include a list of previous commit hash IDs, usually exactly one entry long. This list of previous commits provides edges (outgoing arcs) that, together with the commits and their hash IDs, form a Directed Acyclic Graph or DAG, which—besides the hash IDs themselves and their magic uniqueness—is what makes Git work.
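To see both parts of a commit for yourself, you can ask Git to print them. This is a small sketch in a scratch repository; the file name, commit messages, and "Example" identity are placeholders.

```shell
set -e
# Throwaway identity so that commits work without any global configuration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

echo one > file.txt
git add file.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)

echo two > file.txt
git commit -q -a -m "second"

# Metadata: a "tree" line (the snapshot), a "parent" line (the previous
# commit's hash ID), author, committer, and the log message.
git cat-file -p HEAD

# Snapshot: every file saved with this commit.
git ls-tree -r HEAD

# The parent recorded in the second commit is the first commit.
parent=$(git rev-parse HEAD^)
```

The parent line printed by git cat-file -p is the outgoing arc described above: the second commit records the first commit's hash ID.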

Because commits are read-only and otherwise useless for getting any actual work done, we must extract a commit—with git checkout or git switch—to work on or with it. To do that, we have to give Git its hash ID. So we need to find the hash IDs. We could use git log to do that, but how does git log find hash IDs?

Branch names and other names

This is where branch names enter the picture. A branch name, in Git, holds the hash ID of one commit—just one!—which Git then calls the tip commit of that branch. By definition, that one commit is the latest or last commit that is on that branch.

Remember that each commit stores the hash ID of some earlier commit(s), usually exactly one such commit. If we carefully arrange our commits so that each one points to the one that was the latest, some time ago, we get a chain of commits. If we want, we can draw that chain.

Let's use uppercase letters to stand in for hash IDs, because the hash IDs are too ugly and random-looking for humans to bother with. So here H stands in for the last commit's hash ID, in the branch:

... <-F <-G <-H

Commit H contains a snapshot of all of our files, as of the form they had when we (or whoever) made commit H. So extracting commit H will get us those files.

Commit H contains, as part of its metadata, the hash ID of earlier commit G. Git says that G is the parent of H. Earlier commit G has both snapshot and metadata, just like commit H. We could have Git extract commit G, and then we'd get all of its files, or we can have Git follow the metadata hash ID in G back to still-earlier parent commit F.

Commit F, of course, has a snapshot and metadata, so Git can find an earlier commit from there, and use that to find a still-earlier commit, and so on, forever—or rather, until it gets back to the very first commit (commit A, presumably). This commit can't point backwards, so it just doesn't. It has an empty list of previous commit hash IDs. This lets a program like git log stop: there's nothing further back.

Since no part of any commit can ever change, we don't really need to draw the arrows as arrows, as long as we remember that they only point backwards, and Git can only work backwards through them. (Git can't go forwards, because we don't know the hash ID of a future commit. They're random-looking and unpredictable. By the time we make a new commit, it's too late to record the new commit's hash ID in the parent commit. The child can point backwards to the parent, and hence remember the parent, but parents never know their children. It's very sad.¹)
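The backwards walk can be sketched with three commits in a scratch repository. The commit messages F, G, and H only stand in for the letters in the drawing; real hash IDs will differ on every run.

```shell
set -e
# Throwaway identity so that commits work without any global configuration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Three commits in a row; each one records the previous one as its parent.
for msg in F G H; do
    git commit -q --allow-empty -m "$msg"
done

# git log starts at the current commit and follows parents backwards.
subjects=$(git log --format=%s | tr '\n' ' ')
echo "$subjects"

# The walk stops at the commit with an empty parent list (the root commit).
root=$(git rev-list --max-parents=0 HEAD)
root_subject=$(git log -1 --format=%s "$root")
echo "root commit: $root_subject"
```

Here the newest commit comes out first and the parentless commit comes out last, which is where the walk has to stop.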

Meanwhile, a branch name points to a commit:

...--G--H   <-- master

We can have more than one branch name pointing to commit H, though:

...--G--H   <-- feature, master

We need a way, in this drawing, to know which branch name we're using. To get that, let's attach the special name HEAD, written in all uppercase like this,² to exactly one branch name:

...--G--H   <-- feature, master (HEAD)

This means we're using commit H via the name master. If we run:

git switch feature     # or git checkout feature

we get:

...--G--H   <-- feature (HEAD), master

which means we're still using commit H, but through the name feature now.
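You can check this attachment directly. The sketch below, in a scratch repository with placeholder names, shows that switching from master to feature changes which name HEAD is attached to, but not which commit is in use.

```shell
set -e
# Throwaway identity so that commits work without any global configuration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name
git commit -q --allow-empty -m "H"
git branch feature                        # second name for the same commit

attached_before=$(git symbolic-ref --short HEAD)
commit_before=$(git rev-parse HEAD)

git checkout -q feature                   # or: git switch feature

attached_after=$(git symbolic-ref --short HEAD)
commit_after=$(git rev-parse HEAD)

echo "HEAD was attached to $attached_before, now to $attached_after"
echo "commit before: $commit_before"
echo "commit after:  $commit_after"
```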

¹Don't anthropomorphize computers: they hate that! 😀

²Lowercase head works sometimes on some computers. Don't fall into the trap of using it: it fails when you start using git worktree, and it doesn't work on all systems. If you don't like typing out HEAD in all uppercase, consider using the one-character synonym @. There were a few cases where @ didn't work in ancient Git, but since Git 1.9 or so it should always work.

Making new commits updates the current branch name

Whether or not the stuff I've said so far makes any sense, it doesn't have any reason behind it until we learn how making a new commit works. We'll skip a bunch of important details here, about Git's index aka staging area, and just say that when you run git commit, Git:

gathers the metadata it needs, such as your name and email address;
gets a log message from you to put into the metadata;
gets the hash ID of the current commit by reading HEAD and the branch name;
saves all the files into a permanent snapshot (with de-duplication) to use for the new commit;
writes all of this stuff out to make the actual commit and obtain a new unique hash ID; and—here's the tricky bit:
writes the new hash ID into the current branch name.

What this means is that if we start out with:

...--G--H   <-- feature (HEAD), master

and make a new commit, we get:

...--G--H   <-- master
         \
          I   <-- feature (HEAD)

The new commit, with its new unique hash ID, points back to existing commit H. The other branch names that point to H don't change: they still point to H. But the current branch name, feature, that used to point to H, now points to I instead.

I is the new tip commit of the branch. Commit H is still on the branch, just as it used to be, but now the branch ends at commit I. If we make another new commit, we get:

...--G--H   <-- master
         \
          I--J   <-- feature (HEAD)

where the name feature now points to commit J instead of commit I.
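The same thing can be watched in a scratch repository. In this sketch (placeholder names, empty commits standing in for real work), only the current branch name moves when a commit is made.

```shell
set -e
# Throwaway identity so that commits work without any global configuration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name
git commit -q --allow-empty -m "H"
h=$(git rev-parse master)

# Attach HEAD to a new name, feature, that also points to H.
git checkout -q -b feature

# Make commit I: Git writes the new hash ID into feature, and only feature.
git commit -q --allow-empty -m "I"

git log --oneline --graph --decorate --all
master_now=$(git rev-parse master)
feature_now=$(git rev-parse feature)
feature_parent=$(git rev-parse feature^)
```

The name master still holds the hash ID of H, while feature holds the hash ID of I, whose parent is H.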

Note: When we talk about commits on a branch, we often really mean some subset of those commits. For instance, many people would casually say branch feature to mean commits I and J only. That's wrong in an important technical sense, but just as people will say that they weigh however many kilograms, when they really mean that they mass that many,³ we should be prepared for the misstatements and interpret them correctly.

³If we ever get mass space travel, once you're in orbit, you'll weigh nothing, but you'll still mass as much as you used to. When calculating ΔV via a=F/m, don't divide by zero!

Clones and other Git repositories

When we clone an existing Git repository, we make a new repository. As always, a repository is a collection of commits. We find some commits using branch names, but that's not the only way to find commits.

In fact, when we use git clone, we point our Git to some existing Git repository and tell it to copy that repository—and it does, but it copies only the commits. It doesn't copy any of their branch names! Instead, by default at least, it takes their branch names—their master, their feature, and whatever other names they might have—and turns those into remote-tracking names.

That is, we end up with a repository in which, instead of:

...--G--H   <-- master
         \
          I--J   <-- feature

we have:

...--G--H   <-- origin/master
         \
          I--J   <-- origin/feature

These names—which are made up of the word origin, which is the name our Git used for our first remote, plus a slash and their branch name—are not branch names. These names are remote-tracking names.⁴ They work as well as branch names, except for one (deliberate) difference: you cannot give them to git switch:

$ git switch origin/master
fatal: a branch is expected, got remote branch 'origin/master'

Using them with git checkout produces what Git calls a detached HEAD, which is not something you normally want.⁵ With git switch you must provide the --detach option. Either way you end up with:

...--G--H   <-- HEAD, origin/master
         \
          I--J   <-- origin/feature

To avoid this problem, after Git clones their commits and changes their branch names into your remote-tracking names, git clone will create one new branch name. You pick one of their branch names with the -b option to git clone, which says, in effect: "using that name as it gets stored in my repository, make a new branch." If you pick feature, you get:

...--G--H   <-- origin/master
         \
          I--J   <-- feature (HEAD), origin/feature

If you don't pick anything—if you omit the -b option—your Git asks their Git software which branch name they recommend. Your Git will then create that name, based on your remote-tracking name.

Either way, though, note that you have many remote-tracking names—one for each of their branch names—and one branch name of your own, now. So git clone didn't copy any of their branch names. It made one new branch name. It points to the same commit, so it's hard to tell from a copy (a bit like your mass and weight are the same value at 1g).
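To see which names a clone actually ends up with, list its references. This sketch builds a small stand-in for "their" repository (all names are placeholders), clones it with -b feature, and then prints every name in the clone.

```shell
set -e
# Throwaway identity so that commits work without any global configuration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
work=$(mktemp -d)

# "Their" repository: master at H, feature at J.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name
git commit -q --allow-empty -m "H"
git checkout -q -b feature
git commit -q --allow-empty -m "I"
git commit -q --allow-empty -m "J"
git checkout -q master

# Clone it, asking for a local branch made from their feature.
git clone -q -b feature "$work/upstream" "$work/clone"
cd "$work/clone"

# refs/heads/ holds branch names; refs/remotes/ holds remote-tracking names.
git for-each-ref --format='%(refname)'
local_branches=$(git for-each-ref --format='%(refname:short)' refs/heads)
```

The clone has refs/remotes/origin/ entries for both of their branches, but only one branch name of its own, feature, pointing to the same commit as origin/feature.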

⁴Git calls them remote-tracking branch names or, as in the error message above, remote branch, but the word branch here just makes things more confusing, in my opinion. Leave it out and you have lost nothing of value: say remote-tracking name instead.

⁵Git uses this detached-HEAD mode internally for various purposes, including ongoing rebases, so you will find yourself in this mode during interactive or conflicted rebase operations. That's one way you would use these normally; another is to look at historical commits' files. But these are relatively rare situations, all things considered.

Branch names, remote-tracking names, `fetch`, and upstreams

Besides the fact that you can't get "on" a remote-tracking name—that git switch origin/master or git switch origin/feature fails, and git checkout puts you in detached HEAD mode for these—there's another way that a branch name differs from a remote-tracking name: Each branch name can have one upstream set. The upstream of a branch—which is usually a remote-tracking name—sets up some convenience modes for you. That's really all it does, but they are pretty useful conveniences.

In particular, if the upstream of branch B is origin/B, your Git knows that these two names are related. (Otherwise, your Git assumes they're not related, even though the two names are so similar.) Once they're set up to be related like this, git switch B or git checkout B tells you immediately whether you're in sync with origin/B, or out of sync with it.

There's a tricky bit here that people used to always-on Internet miss sometimes though. Before your Git knows whether origin/B is up to date, you must run git fetch origin (or just git fetch, which tends to default to origin here). This has your Git software reach out to other Git software, at the same site you used when you did your original git clone.

Their Git will, at this time, list out their branch names and most-recent commit hash IDs. Your Git will use this list to check on your most-recent-commits for each remote-tracking name. If they have new commits that you don't, your Git can tell: your Git either has those hash IDs, or it doesn't; if it has them, it has the right commits, and if not, it needs to get the new commits. So your Git has them send over any new commits they have, that you don't. Then your Git updates your remote-tracking names.

So, after git fetch, you have all your commits, plus all the commits you and they shared originally, plus any new commits they have that you don't. So now, after git fetch, your Git will correctly report the state. But you do have to run git fetch first. (Note that your repository may fall behind in the seconds, hours, days, weeks, or years that you go between git fetch steps.)
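The "falling behind" effect is easy to reproduce. In this sketch (placeholder repositories and names), the clone's origin/master stays stale after a new commit appears upstream, and catches up only once git fetch runs.

```shell
set -e
# Throwaway identity so that commits work without any global configuration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
work=$(mktemp -d)

git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/master
git -C "$work/upstream" commit -q --allow-empty -m "H"
git clone -q "$work/upstream" "$work/clone"
at_clone_time=$(git -C "$work/clone" rev-parse origin/master)

# Someone adds a commit over at origin after we cloned.
git -C "$work/upstream" commit -q --allow-empty -m "new upstream work"
theirs=$(git -C "$work/upstream" rev-parse master)

# Our remote-tracking name has not moved: our Git has not asked yet.
before_fetch=$(git -C "$work/clone" rev-parse origin/master)

# git fetch gets the new commit and updates origin/master.
git -C "$work/clone" fetch -q origin
after_fetch=$(git -C "$work/clone" rev-parse origin/master)

echo "before fetch: $before_fetch"
echo "after fetch:  $after_fetch"
```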

Once you have their commits, you then have to decide what to do with your commits: the ones that you have that they don't. You might not have any such commits. That makes it easy: with no commits, you have no major decisions to make! But if you do have commits that they don't ... well, let's draw the situation:

          I--J   <-- your-branch (HEAD)
         /
...--G--H   <-- master, origin/master
         \
          K   <-- origin/their-branch

Here, your master and their master—your origin/master—are even: all names point to commit H, the last on the master or main line. But your branch your-branch, whose name is different from the name their-branch, points to commit J. Their branch their-branch—remembered in your repository as origin/their-branch—points to commit K. What, if anything, would you like to do about this situation?

The things you can do, and what you should do and when, are well beyond the scope of this answer. But if you've set the upstream of your-branch to be origin/their-branch, your Git will tell you that you are "2 ahead" of them: that's commits I-J. Your Git will tell you that you are also "1 behind" them: that's commit K.
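Those counts can be reproduced with a scratch setup that matches the drawing. This is only a sketch: the repositories are throwaway, and empty commits stand in for I, J, and K. The @{upstream} syntax names whatever upstream is set for the current branch.

```shell
set -e
# Throwaway identity so that commits work without any global configuration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
work=$(mktemp -d)

# "Their" repository: master at H, their-branch at K.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name
git commit -q --allow-empty -m "H"
git checkout -q -b their-branch
git commit -q --allow-empty -m "K"
git checkout -q master

# Our clone: your-branch starts at H and gains I and J.
git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"
git checkout -q --no-track -b your-branch origin/master
git commit -q --allow-empty -m "I"
git commit -q --allow-empty -m "J"

# Relate the two names, then ask how far apart they are.
git branch --set-upstream-to=origin/their-branch
git status -sb
set -- $(git rev-list --left-right --count 'HEAD...@{upstream}')
ahead=$1
behind=$2
echo "ahead $ahead, behind $behind"
```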

Humans being human, we tend to "like" it if related branches have the same name. So if your branch your-branch turns out to be doing the same job as their branch their-branch, you might want to rename your branch at this time. That's trivial to do: git branch -m their-branch will rename it and you will now have:

          I--J   <-- their-branch (HEAD)
         /
...--G--H   <-- master, origin/master
         \
          K   <-- origin/their-branch

You still have to decide how to combine your work, in commits I-J, with their work in commit K. The usual ways to do this are to use git merge or git rebase. Both commands will use the upstream setting of your current branch to know that their-branch and origin/their-branch are related, and will hence work with commits J and K automatically, if you have the upstream set. If you don't have the upstream set this way, you must tell them to use the name origin/their-branch or the hash ID of commit K (either will work fine here). Having the upstream set is convenient.
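As one possible way to finish the job, here is a sketch that renames the branch and then merges. It uses the same throwaway setup as the drawing, with empty commits standing in for I, J, and K, and it spells the upstream out as @{upstream} rather than relying on what a bare git merge defaults to in any particular Git version.

```shell
set -e
# Throwaway identity so that commits work without any global configuration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
work=$(mktemp -d)

# "Their" repository: master at H, their-branch at K.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name
git commit -q --allow-empty -m "H"
git checkout -q -b their-branch
git commit -q --allow-empty -m "K"
git checkout -q master

# Our clone: your-branch at J, with origin/their-branch as its upstream.
git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"
git checkout -q --no-track -b your-branch origin/master
git commit -q --allow-empty -m "I"
git commit -q --allow-empty -m "J"
git branch --set-upstream-to=origin/their-branch

# Rename the current branch; the upstream setting moves with it.
git branch -m their-branch
j=$(git rev-parse HEAD)
k=$(git rev-parse origin/their-branch)

# Combine J and K with a merge commit.
git merge -q --no-edit '@{upstream}'
git log --oneline --graph
```

The result is a new merge commit on their-branch whose first parent is J and whose second parent is K. A rebase would instead copy I and J so that the copies come after K.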

Conclusion

Practice drawing commit graphs. Which commits link to which other earlier commits? (How do you know?) Run git log --graph or git log --oneline --graph; compare the graph it drew to the one you drew.

Remember that names (whether they're branch names, tag names, or remote-tracking names) are just ways for Git to find commits. The hash ID of the commit is what really matters to Git. That's how Git will find the commit itself. The commits hold snapshots and metadata, and hence each commit finds some earlier commits. Use this information to draw your graphs.