smwikipedia

Reputation: 64213

Does an upstream branch have to be a remote branch?

I ran the two commands below successfully, which leads me to think that an upstream branch is not necessarily a remote branch. So what good is a local upstream branch?

$ git branch local_branch5 --set-upstream origin/branch5
Branch local_branch5 set up to track remote branch branch5 from origin.

$ git branch local_branch6 --set-upstream local_branch3
Branch local_branch6 set up to track local branch local_branch3.

My branches look like this:

$ git branch -vv
  local_branch3          a758e52 Initial commit
* local_branch4          db11990 Update README.md
  local_branch5          ce0762c [origin/branch5] change on local_branch2
  local_branch6          a758e52 [local_branch3] Initial commit
  master                 064aa08 [origin/master: ahead 1] change on local master

ADD 1

As I tested, it seems the so-called upstream just creates a relation between two branches (named commits). Both branches can be local, or one can be local and the other remote. The only thing to note is that when pushing to a local upstream, the repo needs to be . (a literal dot).

Upvotes: 1

Views: 145

Answers (1)

torek

Reputation: 488263

You are correct: local branches can have other local branches as their upstream setting. I'd reword your "ADD 1" section to say that it's the remote part of the setting that must be . (a single literal dot) for the upstream to be another ordinary, i.e., local, branch.

See "How do I get git to show me which branches are tracking what?" for a related question and various answers.
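To see this in the configuration, here is a minimal sketch in a throwaway repository. It assumes a Git new enough to have --set-upstream-to (the replacement for the older --set-upstream form used in the question) and the -C option. The temporary directory and the commit identity are placeholders for illustration only.

```shell
# Throwaway repository; the path comes from mktemp and is purely illustrative.
repo=$(mktemp -d)
git init -q "$repo"

# Name the initial (still unborn) branch explicitly, so the sketch does not
# depend on whatever init.defaultBranch is set to.
git -C "$repo" symbolic-ref HEAD refs/heads/local_branch3
git -C "$repo" -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "Initial commit"

# Create a second local branch and make the first one its upstream.
git -C "$repo" branch local_branch6
git -C "$repo" branch --set-upstream-to=local_branch3 local_branch6 >/dev/null

# Read back the two low-level settings that together form the upstream.
remote=$(git -C "$repo" config branch.local_branch6.remote)
merge=$(git -C "$repo" config branch.local_branch6.merge)
echo "remote=$remote merge=$merge"
```

The remote part comes back as a single dot, meaning "this repository", and the merge part is the full name of the other local branch.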

You can ignore all of the below. It's not really part of the answer; it just explains how we got here, and why some of the quirks are the way they are.

Deeper history / Git magic

It's also worth noting that only local branches have an upstream, and that this upstream setting is configured, at a very low level, in two parts. One of these two, though, uses a third part, even though in normal setups this third part is pretty much invisible. The reason for this is historic, going back to Git versions that were in use before 2007 or so.

If you look at your per-repository configuration—the file .git/config, or the output from git config --list --local—you will see text like this:

$ git config --list --local
[snip]
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
[snip]
branch.master.remote=origin
branch.master.merge=refs/heads/master

This means that (local) branch master has origin/master as its upstream. The two parts are branch.master.remote, which tells Git which remote to use, and branch.master.merge, which appears to refer back to your own local master again. What's up with that?

Before remote-tracking branches existed

The answer lies in the fact that Git's remote-tracking branches themselves did not exist in very early—prehistoric, as it were—Git. Back in the bad old days, your Git did not remember for you the state of some other Git's repository.

When you connected to some other (upstream) Git and pulled commits from them, you would, in effect, go fetch commits from their master branch. That is, in fact, still what happens today. Your Git would drop this information into a file, .git/FETCH_HEAD, saying: "when I looked at <url>, I found a branch named master with commit b06d364...". Your Git still does this today, too. But then your Git would just stop: the FETCH_HEAD stuff was all you got.

So, at this point you had two master-s: yours, and theirs. Both are just named master. Your Git would then have to rebase or merge your commits right away, before your FETCH_HEAD contents were overwritten by any other different upstream's results. This meant your Git needed to know that when you wanted to rebase or merge with your upstream, it should go get their master, and then rebase or merge your master using the information in .git/FETCH_HEAD under the name master.

Thus, the branch.master.* settings come in two parts: remote = origin—this used to be just a raw URL—and then merge = master. The assumption back then was, and still is today, that the action should be to merge, even though rebase is arguably a better default. The pull script would fetch from the URL, then use the contents of .git/FETCH_HEAD to get a commit hash for merging:

$ head -3 .git/FETCH_HEAD | sed 's,git://.*,<url>,'
b06d3643105c8758ed019125a4399cb7efdcce2c                branch 'master' of <url>
95d67879735cfecfdd85f89e59d993c5b4de8835        not-for-merge   branch 'maint' of <url>
4ebf3021692df4cb51da8d806fbb8b909ee7e111        not-for-merge   branch 'next' of <url>

Note how some of these—in fact, all but one—have the string not-for-merge in them. That's a hint to the original git pull script: only the master line is meant for use in merging.
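As an illustration of that marker only (this is not the actual logic of the old pull script), the following sketch picks the merge candidate out of the sample lines shown above. The temporary file is a stand-in for .git/FETCH_HEAD, and <url> is the same placeholder used in the listing.

```shell
# Stand-in for .git/FETCH_HEAD, filled with the sample lines from the listing.
sample=$(mktemp)
cat > "$sample" <<'EOF'
b06d3643105c8758ed019125a4399cb7efdcce2c                branch 'master' of <url>
95d67879735cfecfdd85f89e59d993c5b4de8835        not-for-merge   branch 'maint' of <url>
4ebf3021692df4cb51da8d806fbb8b909ee7e111        not-for-merge   branch 'next' of <url>
EOF

# Drop every line carrying the not-for-merge marker, then keep the hash,
# which is the first whitespace-separated field.
merge_hash=$(grep -v 'not-for-merge' "$sample" | awk '{print $1}')
echo "$merge_hash"
```

With the sample data, only the master line survives the filter, so the hash printed is the one a merge would use.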

Enter remotes

It was soon obvious to early Git users that it was nice to use a shorter name than a full URL, especially if you had several different other Git repositories from which you would frequently pull good work. Instead of typing in protocol://some.university.ac.uk/long/path/to/brians/repo, you could just say brian.

The exact mechanism for this was apparently somewhat of a contest, or was invented independently multiple times in different ways. I was not in on this particular bit of history myself, but traces linger even today in the git fetch documentation, under the section named remotes:

The name of one of the following can be used instead of a URL as <repository> argument:

  • a remote in the Git configuration file: $GIT_DIR/config,

  • a file in the $GIT_DIR/remotes directory, or

  • a file in the $GIT_DIR/branches directory.

Remote-tracking branches

The first option "won" and is what people actually use today. (The other two are still supported. I don't know why—they seem like they could have been deprecated and then dropped some time ago.)

Because the first option won, though, we can move on to the configured remote-tracking branches section of this same documentation:

You often interact with the same remote repository by regularly and repeatedly fetching from it. In order to keep track of the progress of such a remote repository, git fetch allows you to configure remote.<repository>.fetch configuration variables.

Typically such a variable may look like this:

[remote "origin"]
    fetch = +refs/heads/*:refs/remotes/origin/*

This (and the rest of this section) defines how remote-tracking branch names work. The heart of the idea is that, hey, there's another Git over there somewhere: we run git fetch brian to get Brian's stuff. Can't we then remember: This is the stuff Brian's Git said he had, the last time we called up his Git?

Of course, we can: we'll do that by changing Brian's master to our brian/master. The mapping, from Brian's branch names to our remote-tracking names we store in our own Git, comes from this fetch = ... line, which we control. Each fetch line—there can be more than one—has the form +src:dst, where the leading plus sign—which means the same as --force, more or less—is optional but almost always present, and the left and right sides of the colon give "their names" and "our names" for each of their references that we intend to copy.

It's almost always the case that for remote R, the src part is just refs/heads/* and the dst part begins with refs/remotes/ and then lists the remote-name R itself. Hence, when we have a remote named brian, the fetch setting is +refs/heads/*:refs/remotes/brian/*.

The two asterisks get matched up: any branch that Brian has (refs/heads/*) gets force-copied (+) to our own repository, but renamed into our remote-tracking branch name space (refs/remotes/). We add another word to make sure that Brian's branches, under our refs/remotes/brian/, don't conflict with Nadia's branches, under our refs/remotes/nadia/. And then we just copy Brian's (or Nadia's, or origin's) branch name, and that becomes our remote-tracking branch name for this branch.
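The asterisk matching can be sketched in a few lines of plain shell. The helper name map_ref is hypothetical, not a Git command, and it handles only the common form with one trailing asterisk on each side. Git's real refspec handling covers more cases.

```shell
# map_ref REFSPEC REF
# Print the name that REF (a name on the remote) maps to in our repository,
# or return 1 if REF does not match the left side of the refspec.
# Hypothetical helper for illustration only.
map_ref() {
    spec=${1#+}            # drop the optional leading plus sign (force flag)
    src=${spec%%:*}        # left of the colon: their names
    dst=${spec#*:}         # right of the colon: our names
    src_prefix=${src%\*}   # e.g. refs/heads/
    dst_prefix=${dst%\*}   # e.g. refs/remotes/brian/
    case $2 in
        "$src_prefix"*) printf '%s\n' "$dst_prefix${2#"$src_prefix"}" ;;
        *) return 1 ;;
    esac
}

# The same branch name on two remotes lands in two separate name spaces.
map_ref '+refs/heads/*:refs/remotes/brian/*' refs/heads/master
map_ref '+refs/heads/*:refs/remotes/nadia/*' refs/heads/master
```

The two calls print refs/remotes/brian/master and refs/remotes/nadia/master, which is why Brian's and Nadia's branches never collide.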

How this ties in with a local branch's branch@{upstream}

Most local branches have a remote-tracking branch as their upstream setting. (As this SO question notes, that's not actually required, it's just the usual case.) But, when local branch B "tracks"—i.e., has as its upstream—remote-tracking branch refs/remotes/origin/B, the configuration entries just say that branch.B.remote is origin and branch.B.merge is B. There's no explicit refs/remotes/origin/B in there at all!

This is all because of the twisty history (twistory?) above. Before there were any remote-tracking branches at all, Git's configuration listed the name of the branch as found on the remote. Branch B "merges with" the origin version of branch B. Since it was spelled that way back in 2005 or so, Git requires that it still be spelled that way today. Git then takes the old spelling and maps it through the fetch = line to figure out the correct remote-tracking branch.

Suppose, for instance, that you have branch test with upstream origin/master. That is, git config --list --local will include:

remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*

(the usual and normal pattern) and:

branch.test.remote=origin
branch.test.merge=refs/heads/master

One could be forgiven for thinking that all Git does is string together the remote name, origin, with the master part of the branch.test.merge=refs/heads/master, to get origin/master. But in fact, that's not quite what happens. Let's make another remote, but then change its fetch line:

git remote add weird <url>
git config remote.weird.fetch '+refs/heads/*:refs/remotes/twisty/*'

Now we'll run git fetch weird:

$ git fetch weird
From <url>
 * [new branch]          maint      -> twisty/maint
 * [new branch]          master     -> twisty/master
 * [new branch]          next       -> twisty/next
 * [new branch]          pu         -> twisty/pu
 * [new branch]          todo       -> twisty/todo

Note that they did not go to weird/* but to twisty/*. We can now make a branch that tracks (has as its upstream) twisty/master:

$ git checkout -b test --track twisty/master
Branch test set up to track remote branch master from weird.
Switched to a new branch 'test'
$ git rev-parse --symbolic-full-name test@{upstream}
refs/remotes/twisty/master
$ git config --list --local
[snip]
branch.test.remote=weird
branch.test.merge=refs/heads/master

The remote is weird and the merge never mentions twisty, yet the upstream really is twisty/master.
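For anyone who wants to reproduce this without a network, here is a sketch that swaps the <url> above for a second throwaway repository on the local disk. Both directories come from mktemp, and the commit identity is a placeholder. These are assumptions made for the example, not part of the original session.

```shell
# Two throwaway repositories: "theirs" plays the remote, "ours" is the clone side.
theirs=$(mktemp -d)
ours=$(mktemp -d)

# Give "theirs" a master branch with one (empty) commit.
git init -q "$theirs"
git -C "$theirs" symbolic-ref HEAD refs/heads/master
git -C "$theirs" -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "Initial commit"

# In "ours", add the remote under the name weird, then redirect its fetch
# mapping so that its branches land under refs/remotes/twisty/.
git init -q "$ours"
git -C "$ours" remote add weird "$theirs"
git -C "$ours" config remote.weird.fetch '+refs/heads/*:refs/remotes/twisty/*'
git -C "$ours" fetch -q weird

# Create a local branch whose upstream is twisty/master.
git -C "$ours" checkout -q -b test --track twisty/master >/dev/null

# Compare the resolved upstream with the two raw configuration values.
upstream=$(git -C "$ours" rev-parse --symbolic-full-name 'test@{upstream}')
remote=$(git -C "$ours" config branch.test.remote)
merge=$(git -C "$ours" config branch.test.merge)
echo "upstream=$upstream remote=$remote merge=$merge"
```

The resolved upstream names twisty, while the stored settings only ever mention weird and refs/heads/master. The twisty part comes entirely from the fetch mapping.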

Cleaning up

Since I do not really want the test branch and the weird twisty setup, let's clean them up:

$ git checkout master
...
$ git branch -D test
Deleted branch test (was 8b2efe2a0).
$ git remote remove weird

That's all we need to do here: git remote remove weird deletes all the twisty/* remote-tracking branches, because it too obeys the fetch = configuration lines.

Upvotes: 2
