Reputation: 485
When I want to rebase against remote master, I use
git pull --rebase origin master
If I use
git pull --rebase origin
I receive the error
You asked to pull from the remote 'origin', but did not specify
a branch. Because this is not the default configured remote
for your current branch, you must specify a branch on the command line.
But why is it that
git rebase -i origin
works?
And in this case
git rebase -i origin master
actually results in
fatal: fatal: no such branch/commit 'master'
I have no local branch named master, but why is it not going to the remote branch in this case?
Upvotes: 0
Views: 1479
Reputation: 488183
The git pull command is quite different from most other Git commands. I'd say that in many ways, the closest other Git command is git gc, which (like git pull) is a convenience wrapper to avoid needing to type in multiple separate Git commands.¹
What git pull does is:
1. run git fetch; then
2. run a second Git command.
The first command, git fetch, needs the name of a remote. The name origin is the standard name for the first remote, and since most Git repositories only have one remote, origin is the name for the first, last, and only remote in that repository.
You can leave this off (that is, run git pull with no additional arguments) and Git will figure out some appropriate remote. But if you're going to supply additional arguments, the first non-option argument is the remote name, so git pull frabjous uses the word frabjous as the name of the remote.
The second command is either git merge or git rebase.² This second command needs a commit hash ID, such as 4c53a8c20f8984adb226293a3ffd7b88c3f4ac1a, or something that will work in place of a commit hash ID.³ Usually, though, we use a name (a branch name like master or main, or dev, or whatever) as the "something that will work" here. The general idea, the way to think of git pull, is: get stuff from the other guy, then incorporate it. The "other guy" here is the remote, and the "stuff to get" is "any new commits he has on some branch of his". So the name you put in here, when you put in a name, is the other guy's branch name.
Note that, as with git fetch, you can leave all of this out, and just run:
git pull
The pull command will figure out a remote to use (probably origin) and a name to use, all on its own, based on the upstream that you have set for the current branch. The "upstream" is just a thing-you-can-set: for your branch named xyzzy, the upstream is probably already set to origin/xyzzy.
Note that the upstream name here, origin/xyzzy, has a slash in it: it's made up of the name of the remote, origin, then a slash, then the branch name as seen on the remote, xyzzy. So if the branch name as seen on the remote is frab/jous, you'd have origin/frab/jous here, with two slashes: one to separate origin from the other guy's branch name, and one in the other guy's branch name.
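To see what upstream a branch actually has, you can ask Git directly. The following is only a sketch: the throwaway repositories theirs and mine, the helper function g, and the demo identity are all made up for illustration, and frab/jous is reused from the text above as the other guy's branch name.

```shell
set -e
work=$(mktemp -d)
# Hypothetical helper: supplies a throwaway identity so commits work
# even on a machine with no Git configuration.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# "theirs" plays the other guy's repository; its only branch is frab/jous.
g init -q "$work/theirs"
g -C "$work/theirs" symbolic-ref HEAD refs/heads/frab/jous
g -C "$work/theirs" commit -q --allow-empty -m "first"

# Cloning checks out their current branch and records its upstream for us.
g clone -q "$work/theirs" "$work/mine"
cd "$work/mine"

# The upstream is stored as two settings: the remote, and THEIR branch name.
remote=$(git config branch.frab/jous.remote)
merge=$(git config branch.frab/jous.merge)
# Git combines the two into OUR remote-tracking name for that branch.
upstream=$(git rev-parse --abbrev-ref --symbolic-full-name '@{upstream}')

echo "remote:   $remote"
echo "merge:    $merge"
echo "upstream: $upstream"
```

If the current branch has no upstream set, the rev-parse line fails instead of printing a name, which is the situation in which a bare git pull cannot work out what to fetch and incorporate.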
If you're going to put in a name at all, on your git pull command, you must place this after the remote. Once you have done that, Git assumes you'll just put in the branch name as seen on the remote. So you type in:
git pull origin frab/jous
or whatever here, to mean:
1. run git fetch origin; then
2. resolve origin/frab/jous to a hash ID and run git merge or git rebase as appropriate.
Note that either of these two steps can fail entirely, and the second one can stop in the middle. If one step fails, any remaining steps don't happen at all, and if you want to pick up where it left off you should restart from the failed point, so you need to know which step failed. Luckily for most of us, git fetch is very safe to run an extra time, so we can mostly just ignore whether it failed or succeeded. But you still need to know whether to finish a stopped-in-the-middle merge or rebase. For this and other reasons, I always encourage Git newbies to learn the separate commands first. Recognizing when they've worked, when they've failed completely, and when they have stopped in the middle is important.
Unfortunately, that means you need to learn that oddity, where git pull takes the other guy's name for the branch (leaving out the origin/), and git merge or git rebase takes your name (including the origin/). But you were going to have to learn this anyway. Make a note of it! Their branch names are theirs; your Git repository reads their name-and-hash-ID values from them (during the git fetch step), and stores them in your Git repository under these origin/-prefixed names.
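Here is a small sketch of that oddity, done both ways. The repositories theirs, mine and mine2, the helper g, the demo identity and the branch name main are assumptions for illustration only. One clone runs the two separate commands; the other runs the git pull wrapper.

```shell
set -e
work=$(mktemp -d)
# Hypothetical helper: supplies a throwaway identity for the demo commits.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# The other guy's repository, with one commit on main.
g init -q "$work/theirs"
g -C "$work/theirs" symbolic-ref HEAD refs/heads/main
g -C "$work/theirs" commit -q --allow-empty -m "first"

# Two identical clones, so both approaches start from the same state.
g clone -q "$work/theirs" "$work/mine"
g clone -q "$work/theirs" "$work/mine2"

# The other guy adds a commit after we cloned.
g -C "$work/theirs" commit -q --allow-empty -m "second"

# Separate commands: fetch takes the remote, rebase takes OUR name
# for their branch, including the origin/ prefix.
cd "$work/mine"
g fetch -q origin
g rebase -q origin/main
two_step=$(git rev-parse HEAD)

# Wrapper: pull takes the remote, then THEIR name for the branch,
# without the origin/ prefix.
g -C "$work/mine2" pull -q --rebase origin main
one_step=$(git -C "$work/mine2" rev-parse HEAD)

theirs=$(git -C "$work/theirs" rev-parse main)
echo "fetch + rebase: $two_step"
echo "pull --rebase:  $one_step"
echo "their main:     $theirs"
```

Since neither clone has commits of its own here, both should end up at the other guy's latest commit.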
This is still leaving out a lot. Git has a very steep setup learning curve. I'll take a break now for footnotes and then address one other thing.
¹ git gc runs git repack, git prune-packed, git reflog expire, git worktree prune, git prune, git pack-refs, and/or git rerere gc if/as appropriate. This isn't meant to be a completely exhaustive list as the list has changed at times (e.g., git worktree didn't exist before Git 2.5) and I don't really keep track. I generated this list by glancing over the git gc documentation. I think this particular manual page might have been the main inspiration for https://git-man-page-generator.lokaltog.net/ 😀
² There are a few special-case exceptions, including doing nothing at all if the git fetch step fails.
³ This is a bit of an oversimplification, as git merge and git rebase can take more than one hash ID, and for a case that is never used by git pull, git rebase also requires a branch name. For the purpose of being run by git pull, though, they wind up using hash IDs here.
origin is both a remote and ... well...
But why is it that
git rebase -i origin
works?
Here's where another piece of the steep learning curve whacks you in the face.
Git is, in the end, all about commits. The commits in the repository are the reason for using Git. The individual commits are numbered, but the numbers themselves are big, ugly, random-looking things that are completely unsuited for humans. These are the hash IDs or Object IDs that spill out from git log, for instance. They're only really usable via cut-and-paste, so we mostly don't use them after all: we use names.
As a result, Git provides not one but two key-value databases. One of these is indexed by the hash IDs, and that's how Git gains access to its commits and other internal data. Git puts in a hash ID, and gets the commit or other object whose key is that particular hash ID. When the object is a commit object, that represents a full snapshot of every file, frozen for all time in the form it had at the time you (or whoever) made the commit.
To find the hash ID, though, Git keeps a second database where the keys are names: branch names, tag names, and other sorts of names. The branch names, like master or main, dev or develop, frab/jous, and so on, are up to you: you can choose any name you like (although it's wise to stick in a dash or slash or letter outside the [0-9a-f] set, because the "names" cafebabe and badf00d and deadcab could be abbreviated hash IDs). To keep branch and tag names from bumping into each other, Git actually sticks refs/heads/ in front of each branch name, and refs/tags/ in front of each tag name.
The names that Git stores in your repository, so as to remember some other Git repository's branch names, are remote-tracking names (Git calls these remote-tracking branch names) and are actually prefixed with refs/remotes/, so rather than origin/dev, these are really refs/remotes/origin/dev.
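You can dump the full spelling of every name with git for-each-ref. In this sketch the repositories theirs and mine, the helper g and the demo identity are invented for illustration; the other guy deliberately has both a branch and a tag called dev, to show that the prefixes keep them apart.

```shell
set -e
work=$(mktemp -d)
# Hypothetical helper: supplies a throwaway identity for the demo commit.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

# The other guy has branches main and dev, plus a tag that is also named dev.
g init -q "$work/theirs"
g -C "$work/theirs" symbolic-ref HEAD refs/heads/main
g -C "$work/theirs" commit -q --allow-empty -m "first"
g -C "$work/theirs" branch dev
g -C "$work/theirs" tag dev

g clone -q "$work/theirs" "$work/mine"
cd "$work/mine"

# Print the fully spelled-out name of every ref in our clone.
refs=$(git for-each-ref --format='%(refname)')
echo "$refs"
```

The listing should include our own branch under refs/heads/, their branches under refs/remotes/origin/, and the tag under refs/tags/, so their branch dev and the tag dev never collide.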
All of these names, in these various namespaces, hold one hash ID each. That's all Git needs, because commits themselves also hold other commit hash IDs. From one commit, Git can find another one. From there, Git can find yet another commit—and so on, and on. Git simply defines a branch name as "this name holds the hash ID of the commit that is to be called the latest on this branch".
So, if you're on some branch main, the name holds some hash ID H, which is the hash ID of some commit:
<-H <-- main
Each commit holds a list of previous-commit hash IDs, usually just one entry long, along with the snapshot of all files. That's the backwards-pointing arrow coming out of H, here. Commit H holds the hash ID of some earlier commit. Let's call that one G and draw it in:
<-G <-H <-- main
Of course, G is a commit with a snapshot and another backwards arrow, so it must point to some earlier commit, which repeats over and over:
... <-F <-G <-H <-- main
and that's a Git branch. To add a commit to a branch, we "check it out" or "switch to it" by name, making the name the current branch name and the corresponding commit H the current commit.
We can have more than one name pointing to this commit. Let's draw in several names: main and dev and also origin/main, which isn't a branch name but still points to a commit. For laziness I'll stop bothering with arrows between commits, but remember that Git only works backwards, never forwards:
...--F--G--H <-- dev, main, origin/main
We pick one branch, let's say dev, to switch to. To remember that we're using the name dev, we attach the special name HEAD to it:
...--F--G--H <-- dev (HEAD), main, origin/main
Now we fiddle around the way we do with Git (which I won't cover here, though the index or staging area, two terms for the same thing, is crucial) and eventually make some new commit. The new commit, which we'll call I, has a new unique hash ID and points backwards to existing commit H, like this:
...--F--G--H
            \
             I
The tricky bit is that Git updates the current branch name as soon as it has finished making new commit I. None of the other names are updated, so they all still point to H:
...--F--G--H <-- main, origin/main
            \
             I <-- dev (HEAD)
Commit I is now the latest commit on dev. Commits up through H are still on dev, and continue to be on main as well. The special name HEAD is still attached to dev, and our current commit is now commit I. Commit H still exists (and, crucially for Git's hashing scheme, is completely untouched: this is why the arrows all go backwards, not forwards).
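The drawing can be reproduced in a scratch repository. This is a sketch only: the repository repo, the helper g and the demo identity are made up, and empty commits stand in for real work.

```shell
set -e
work=$(mktemp -d)
# Hypothetical helper: supplies a throwaway identity for the demo commits.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

g init -q "$work/repo"
cd "$work/repo"
g symbolic-ref HEAD refs/heads/main
g commit -q --allow-empty -m "H"

# A second name for the same commit; then attach HEAD to it.
g branch dev
g checkout -q dev
before=$(git rev-parse main)

# Make commit I while HEAD is attached to dev.
g commit -q --allow-empty -m "I"

main_tip=$(git rev-parse main)        # still H
dev_tip=$(git rev-parse dev)          # now I
dev_parent=$(git rev-parse 'dev^')    # I points backwards to H
current=$(git symbolic-ref --short HEAD)

echo "main:          $main_tip"
echo "dev:           $dev_tip"
echo "parent of dev: $dev_parent"
echo "HEAD is on:    $current"
```

Only the name that HEAD is attached to moves; main keeps the hash ID it had before the commit.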
Okay, but so what? Well, Git is, as I said earlier, all about the commits. When you give Git a branch name, most of the time it very quickly turns that name into a hash ID by figuring out where the name points. (The git switch and git checkout commands are unusual here in that they have to remember the name, too, so that you can become "on" that branch when they're done.) There's a command-line Git command that does this for you, namely git rev-parse. If we give git rev-parse some branch names, we can see it in action:
$ git rev-parse master
5d01301f2b865aa8dba1654d3f447ce9d21db0b5
$ git rev-parse diff-merge-base
fa1c8acabf0d5649baf87f549d67426d14255e0f
It can parse tag names too, though, and remote-tracking names, and with --symbolic-full-name it can tell us what the full spelling of each name is:
$ git rev-parse --symbolic-full-name v2.35.1
refs/tags/v2.35.1
$ git rev-parse --symbolic-full-name origin/master
refs/remotes/origin/master
$ git rev-parse origin/master
5d01301f2b865aa8dba1654d3f447ce9d21db0b5
What happens if we give it origin alone?
$ git rev-parse origin
5d01301f2b865aa8dba1654d3f447ce9d21db0b5
$ git rev-parse --symbolic-full-name origin
refs/remotes/origin/master
Well, that's a bit peculiar, isn't it? Let's take a look at the gitrevisions documentation, which is crucially important and cleverly hidden in plain sight in a pile of 1000 largely unreadable manual pages:
SPECIFYING REVISIONS
...
<refname> e.g., master, heads/master, refs/heads/master
... a <refname> is disambiguated by taking the first match in the following rules:
1. If $GIT_DIR/<refname> exists, that is what you mean (this is usually useful only for HEAD, FETCH_HEAD, ORIG_HEAD, MERGE_HEAD and CHERRY_PICK_HEAD);
2. otherwise, refs/<refname> if it exists;
3. otherwise, refs/tags/<refname> if it exists;
4. otherwise, refs/heads/<refname> if it exists;
5. otherwise, refs/remotes/<refname> if it exists;
6. otherwise, refs/remotes/<refname>/HEAD if it exists.
It's this six-step rule that makes name abbreviations work. We write:
git rebase master
and Git tries master as a file in .git (step 1), but that doesn't exist, so Git goes on to try refs/master (step 2) and then refs/tags/master (step 3). Neither of those exists either, so Git tries refs/heads/master (step 4). That one does exist, in this repository anyway, so it resolves to a hash ID and the revision lookup is complete.
If we use origin/master, step 5 finds it, because refs/remotes/origin/master exists (use git for-each-ref to dump out the ref table, and see that it does exist). And if we use origin, which doesn't seem to be a ref-name at all, step 6 finds it, because refs/remotes/origin/HEAD exists.
Now, HEAD (and correspondingly, refs/remotes/origin/HEAD) is a special case: it's a symbolic reference, which in Git is analogous to a symbolic link in Unix/Linux file systems. (In fact, in early Git implementations, it simply was a symbolic link. That does not work well on Windows though, so now it's a file with contents.) The git for-each-ref command expands the link by default, but git branch -r doesn't, so that's one way to see this.
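Another way to see it is git symbolic-ref, which reads the link itself rather than following it. A sketch, with the repositories theirs and mine, the helper g, the demo identity and the branch name main all assumed for illustration:

```shell
set -e
work=$(mktemp -d)
# Hypothetical helper: supplies a throwaway identity for the demo commit.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

g init -q "$work/theirs"
g -C "$work/theirs" symbolic-ref HEAD refs/heads/main
g -C "$work/theirs" commit -q --allow-empty -m "first"
g clone -q "$work/theirs" "$work/mine"
cd "$work/mine"

# Read the symbolic ref itself: which name does origin/HEAD point to?
target=$(git symbolic-ref refs/remotes/origin/HEAD)
# Resolve the bare word "origin" as a revision (step 6 of the rules).
full=$(git rev-parse --symbolic-full-name origin)
via_origin=$(git rev-parse origin)
via_name=$(git rev-parse origin/main)

echo "origin/HEAD points to: $target"
echo "origin resolves to:    $full"
# git branch -r shows the link unexpanded, as "origin/HEAD -> ...".
git branch -r
```

Both lookups should land on the same remote-tracking name, and therefore on the same commit.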
The conclusion of all of this is:
- origin/HEAD is a symbolic ref for whatever branch is the HEAD in origin, usually master or main;
- origin by itself is either a remote (as used by git fetch), or resolvable via step 6 of gitrevisions (as used by most other Git commands);
- git rebase -i origin resolves it via origin/HEAD and step 6; but
- git pull origin master doesn't use step 6 at all: the string origin is just a remote, and the string master gets mapped through the remote-tracking names to become origin/master (and in this particular case git pull actually sidesteps all this because it's using the .git/FETCH_HEAD file mechanisms, which predate all this stuff and go through somewhat different code paths).
The git pull command passes most of its flags and arguments on to git fetch, except for some flags that it passes to the second command, and some flags that it uses itself. It's enormously complicated because of historical ... mistakes? ideas? concepts? anyway, history of the way Git used to work, which must be preserved in amber for the next 300 million years, or whatever. 😀 (Seriously though, the Git folks take compatibility itself quite seriously and try not to break existing uses and workflows.)
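To tie this back to the error in the question, here is a sketch of the three spellings side by side. Everything about the setup is assumed for illustration: the repositories theirs and mine, the helper g, the demo identity, and main standing in for master. The -i flag is left out so that the sketch runs unattended. Note that git rebase treats a second argument as a branch or commit of yours to rebase, not as a branch on the remote.

```shell
set -e
work=$(mktemp -d)
# Hypothetical helper: supplies a throwaway identity for the demo commit.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

g init -q "$work/theirs"
g -C "$work/theirs" symbolic-ref HEAD refs/heads/main
g -C "$work/theirs" commit -q --allow-empty -m "first"
g clone -q "$work/theirs" "$work/mine"
cd "$work/mine"

# Mimic the question: work on another branch, with no local branch named main.
g checkout -q -b feature
g branch -q -D main

# "origin" alone is resolved as a revision, via refs/remotes/origin/HEAD.
if g rebase -q origin; then bare_origin=ok; else bare_origin=failed; fi

# Here "main" is looked up among OUR names, and we have no such name.
if g rebase -q origin main 2>/dev/null; then their_name=ok; else their_name=failed; fi

# Our own remote-tracking name for their branch does resolve.
if g rebase -q origin/main; then our_name=ok; else our_name=failed; fi

echo "git rebase origin:      $bare_origin"
echo "git rebase origin main: $their_name"
echo "git rebase origin/main: $our_name"
```

Only the middle form should fail, since main is not a name this repository can resolve once the local branch is gone.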
Upvotes: 0