Reputation: 249
I've been playing around with http://git-school.github.io/visualizing-git and I'm not really sure how this works - can I delete remote tracking branches? Say I have local branches master
and origin/master
and a remote repository with master
branch that corresponds to the local origin/master
.
Can I delete origin/master
? If I can and I do so, how do I set up a new remote tracking branch for it again? Would just fetching origin
automatically create it again? If someone pushes some new branch onto the remote repository say feature
, will fetch
always download this automatically and create a remote tracking branch origin/feature
in my local repository? Does fetch
always download "everything on remote repo that you're missing"?
Lastly, I know you can set what remote tracking branch a local branch tracks, say git branch -u origin/feature
(assuming I have feature
checked out) will associate feature
with origin/feature
, both local branches. In this case we call origin/feature
the upstream branch. But can I change which remote branch origin/feature
is associated with, and is this association also called "upstream" ?
I'm mostly just curious and I haven't been really able to recreate the remote tracking branch on the site I linked, after I tried deleting it. But maybe it's as simple as "fetch always creates a new remote tracking branch if it doesn't exist in your local repository".
Upvotes: 0
Views: 799
Reputation: 489828
... can I delete remote tracking branches?
Yes, but there is little point (not quite no point, just "little"). The command-line command to do this is, e.g.:
git branch -r -d origin/master
(though you might have to force the delete in some cases).
Let's be overly explicit here and define remote-tracking branch name (or as I prefer to call it, remote-tracking name) carefully first, just in case. A remote-tracking name is a name that exists in your repository, but whose full name—as shown by, e.g., git rev-parse --symbolic-full-name
or git for-each-ref
—starts with refs/remotes/
(and goes on to include the name of a remote such as origin
and another slash). These routinely get abbreviated to things like origin/master
(though for some reason git branch -a
abbreviates them as remotes/origin/master
instead—compare to git branch -r
which uses the shorter form).
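To see the full and abbreviated spellings side by side, here is a small sketch. It builds two throwaway repositories in a temporary directory; the directory names server and client and the demo identity are placeholders of mine, not anything Git requires.

```shell
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical "server" repository standing in for the remote.
git init --quiet "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/master
git -C "$work/server" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet --allow-empty -m "initial"

# Cloning creates the remote "origin" and its remote-tracking names.
git clone --quiet "$work/server" "$work/client"
cd "$work/client"

# Expand the abbreviated name to its full spelling.
full=$(git rev-parse --symbolic-full-name origin/master)
echo "$full"

# List every full name under refs/remotes/.
git for-each-ref --format='%(refname)' refs/remotes/

# Compare the two abbreviations described above.
git branch -r
git branch -a
```

The first echo should show the refs/remotes/ prefix that the short form hides.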
If I ... do [delete
origin/master
], how do I set up a new remote tracking branch for it again? Would just fetching origin automatically create it again?
In general, yes.
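As a rough check of this, the sketch below deletes origin/master in a throwaway clone and fetches again. The server and client directories, the demo identity, and the has_ref helper are all made up for the illustration, and the clone uses the standard fetch setting.

```shell
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical "server" repository standing in for the remote.
git init --quiet "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/master
git -C "$work/server" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet --allow-empty -m "initial"
git clone --quiet "$work/server" "$work/client"
cd "$work/client"

# Helper (hypothetical): report whether a ref exists in this repository.
has_ref() {
    if git rev-parse --verify --quiet "$1" >/dev/null; then
        echo present
    else
        echo missing
    fi
}

# Delete the remote-tracking name, then look for it.
git branch -r -d origin/master
after_delete=$(has_ref refs/remotes/origin/master)

# A plain fetch with the default refspec creates it again.
git fetch --quiet origin
after_fetch=$(has_ref refs/remotes/origin/master)

echo "after delete: $after_delete, after fetch: $after_fetch"
```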
If someone pushes some new branch onto the remote repository say
feature
, will fetch always download this automatically and create a remote tracking branch origin/feature
in my local repository?
In general, yes.
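A minimal sketch of that case, assuming the standard fetch setting: a branch named feature appears on a stand-in remote after the clone was made, and a plain fetch picks it up. The directory names, demo identity, and has_ref helper are placeholders for the demonstration.

```shell
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical "server" repository standing in for the remote.
git init --quiet "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/master
git -C "$work/server" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet --allow-empty -m "initial"
git clone --quiet "$work/server" "$work/client"

# Someone creates a new branch on the remote after we cloned.
git -C "$work/server" branch feature

cd "$work/client"

# Helper (hypothetical): report whether a ref exists in this repository.
has_ref() {
    if git rev-parse --verify --quiet "$1" >/dev/null; then
        echo present
    else
        echo missing
    fi
}

before=$(has_ref refs/remotes/origin/feature)
git fetch --quiet origin
after=$(has_ref refs/remotes/origin/feature)

echo "before fetch: $before, after fetch: $after"
```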
Does fetch always download "everything on remote repo that you're missing"?
Mostly.
Lastly, I know you can set what remote tracking branch a local branch tracks, say
git branch -u origin/feature
(assuming I have feature
checked out) will associate feature
with origin/feature
, both local branches.
Yes, though be careful about the phrase "local branches": origin/feature
is a name specific to your repository but Git tends to call it a remote-tracking branch name, which leads to casual (mis?)use of the word branch, which is why I like to just drop the word branch entirely here now and call it a remote-tracking name. That way instead of being tempted to call it a "local branch", you'll be tempted to call it a "local name", which I think is much clearer.
In this case we call
origin/feature
the upstream branch.
I prefer to just call it the upstream (again, trying to avoid beating the poor word branch to death :-) ).
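To make the upstream setting concrete, here is a sketch that sets it with git branch -u and then reads it back. The two throwaway repositories and the demo identity are assumptions for the example; the configuration keys it reads are the ones Git uses to record a branch's upstream.

```shell
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical "server" repository with a master and a feature branch.
git init --quiet "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/master
git -C "$work/server" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet --allow-empty -m "initial"
git -C "$work/server" branch feature
git clone --quiet "$work/server" "$work/client"
cd "$work/client"

# Create a local feature branch with no upstream, then set one explicitly.
git checkout --quiet -b feature --no-track origin/feature
git branch -u origin/feature

# The upstream is stored as two settings on the local branch.
remote=$(git config --get branch.feature.remote)
merge=$(git config --get branch.feature.merge)

# Git can also resolve the upstream back to a remote-tracking name.
upstream=$(git rev-parse --abbrev-ref 'feature@{upstream}')

echo "remote=$remote merge=$merge upstream=$upstream"
```

Note that the stored merge value is the branch's name as the remote spells it (refs/heads/feature), not the remote-tracking name in your own repository.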
But can I change which remote branch
origin/feature
is associated with ...
No, or not really: a remote-tracking name has no association, at least not of this kind.
and is this association also called "upstream"?
Without the association, we don't need a term for it. :-)
When you have your Git call up another Git, you supply a set of refspecs—optionally on the command line, but you always supply them. This is true even if you use a raw URL. The general form of a command-line git fetch
command is:
git fetch [<options>] [<repository> [<refspec>...]]
That is, there are optional options
arguments (like --tags
or --no-tags
, and/or --prune
, and so on), an optional repository
argument, and optional refspec
arguments. To supply a refspec
argument, you must first supply the repository
argument, as these are positional arguments: the first non-option argument is the repository, and subsequent non-options are refspecs. So:
git fetch origin
supplies a repository
argument and no refspec
arguments, and:
git fetch origin '+refs/heads/*:refs/remotes/origin/*'
supplies one repository
and then one refspec
.
The repository
argument can be a URL, but these days, generally should be a remote. A remote is simply a short name like origin
. Git stores at least two items under this short name:
- remote.<remote>.url contains the URL that Git can use for fetch and push operations;
- remote.<remote>.pushurl (if set) contains an alternate URL Git should use instead for git push; and
- remote.<remote>.fetch contains the refspec arguments that Git should use, if you don't provide any on the command line.

Here <remote> stands for the remote's name, such as origin. All these settings are stored in your configuration, as shown by git config -l
for instance.1 The act of creating a remote—via git remote add
, for instance—will automatically create both the URL and fetch settings for that new remote. When you run git clone
, Git creates a remote, as if by git remote add
, so that too sets the two usual settings. The default name of this automatically-created remote is origin
, so that's why a Git repository usually has an origin
: most Git repositories tend to be created via cloning. Even those that aren't tend to have a git remote add origin
run in them at some point.
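A quick way to see these settings is to read them back right after a clone and after a git remote add. In this sketch the repositories are throwaway ones, and backup is just a made-up name for a second remote.

```shell
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical "server" repository standing in for the remote.
git init --quiet "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/master
git -C "$work/server" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet --allow-empty -m "initial"
git clone --quiet "$work/server" "$work/client"
cd "$work/client"

# git clone created the remote "origin" with both usual settings.
git config --get remote.origin.url
origin_fetch=$(git config --get remote.origin.fetch)
echo "$origin_fetch"

# git remote add does the same for a new remote ("backup" is hypothetical).
git remote add backup "$work/server"
backup_fetch=$(git config --get remote.backup.fetch)
echo "$backup_fetch"
```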
Note that if you don't supply a remote or URL repository
argument, git fetch
will construct one: it will take the current branch's remote setting (git config --get branch.<branch>.remote
, where <branch>
is the current branch) to find the remote to use, or, as a last-ditch fallback, just use the hardcoded string origin
. So the default git fetch
action is to find the correct remote name, or use origin
, then use the remote.<remote>.url
and remote.<remote>.fetch
settings from there.
One way or another, then, you've run git fetch
or git fetch origin
and supplied one or more refspecs. It is the refspecs that determine what remote-tracking names will be created. The default refspec for the remote named origin
is:
+refs/heads/*:refs/remotes/origin/*
We can disassemble this refspec into its component parts:
- an optional leading plus sign +, followed by
- a source part and a destination part, separated by a colon : character,

where either source or destination (but not both, at least not sensibly) can be omitted. Both the source and destination parts can use an asterisk *
in a way that's similar to, though not precisely the same as, a shell glob.2
The leading plus sign, if present, sets the force flag for this refspec. This force flag is the same flag you get with --force
, except that the --force
option sets it for the duration of the entire git fetch
operation, while the plus sign sets it only for the duration of this one particular refspec.
So, the default origin
refspec:
- has the force flag set;
- has a source of refs/heads/*; and
- has a destination of refs/remotes/origin/*.

This destination is precisely the set of remote-tracking names for the remote origin
, and that's where origin/master
comes from. The process is a bit convoluted, though.
At the start of the conversation your Git has with the other Git—the one at the URL—their Git lists out all their branch and tag names, plus any other refs/*
type names: all their refs or references.3 This list comes with hash IDs, because each ref always stores one hash ID. It's slightly augmented for tag refs (refs/tags/*
). To see exactly what their Git spills out at this point, you can run git ls-remote origin
, which does this first fetch step: call up the other Git and have it list out its refs. Then instead of fetching, git ls-remote
just prints the list of refs.
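For instance, the following sketch runs git ls-remote against a throwaway stand-in for the remote and compares the hash ID it reports for master with what the remote itself says. The repository names and demo identity are placeholders.

```shell
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical "server" repository standing in for the remote.
git init --quiet "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/master
git -C "$work/server" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet --allow-empty -m "initial"
git clone --quiet "$work/server" "$work/client"
cd "$work/client"

# Print every ref the other Git advertises: one hash ID and one name per line.
git ls-remote origin

# Pick out the hash ID they report for their master branch.
listed=$(git ls-remote origin refs/heads/master | cut -f1)

# Ask the "server" repository directly, for comparison.
actual=$(git -C "$work/server" rev-parse refs/heads/master)

echo "listed=$listed"
echo "actual=$actual"
```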
Now that your git fetch
has its paws on their refs, your Git applies the refspecs. Which of their refs match your refspecs? Those are the ones that your Git will inspect more closely.4
At this point, your Git inspects the hash IDs they gave you. Hash IDs are the universal currency of Git exchanges, because every Git in the universe agrees that any one particular hash ID is going to apply to that one particular object.5 Either you have the object already, in which case you have that hash ID in your own repository too, or you don't, in which case you don't. If you don't have the hash ID and do want some commit here, your Git tells their Git that, yes, it wants that hash ID. Assuming this is a commit, or annotated tag object—most of these are; see footnote 5—their Git will offer its parent(s) or, for a tag, its tag-target, and your Git can again say whether it wants the object, or not.
This process—the exchange of hash IDs, and "want" vs "already have" kind of responses—makes up the second phase of git fetch
. Eventually your Git has told their Git which objects (commits and any necessary files to go with them) they should package up and send; and now your git fetch
, and their end, go into a third phase, of building what Git calls a thin pack. This is where you see "counting objects" and "compressing objects" and so on (if you do see them at all—this stuff is run off a timer and some of it is suppressed in some cases).
Finally, they send you this thin pack. Your Git takes the thin pack and "fattens" it into a regular pack, or otherwise incorporates the objects into your own repository. You now have all the objects you need from them, along with all of the hash IDs that correspond to all of the names your Git got from their Git. So if their refs/heads/master
—their master
branch—names commit a1234567...
, and you didn't have a1234567...
before, well, now you do. You also have the parent commits, and their parents, all the way back to the dawn of time, if needed.6 Typically, though, their a1234567...
, if new, is only new for a few commits in length, after which the parent chain leads back into something you got from them yesterday, or whenever—so instead of fetching thousands of commits, you just fetch one, or three, or a dozen, or whatever.
In any case, the conversation with their Git is now done. Your Git has all the objects (commits and associated files) that your Git needs, along with the list of their branch names. Your Git now creates or updates your remote-tracking names via the refspec you supplied, either on the command line, or implicitly via your configuration.
1Normally these should be in the --local
level, although Git itself doesn't care where they come from: it's just weird to set these in your system or global config. The URL and (if specified) push-URL are "last setting overrides" style configuration entries, but the fetch
lines are cumulative settings. That is, if you've set these rather nonsense settings:
git config remote.origin.fetch foo:bar
git config --add remote.origin.fetch baz:quux
then git fetch origin
acts like git fetch origin foo:bar baz:quux
. So adding a remote.origin.fetch
setting to your --global
configuration would add to the standard setting, and this is potentially useful, but also potentially hazardous: you'll need to think hard about doing it.
2The degree of similarity depends on your Git vintage, as some restrictions were lifted in early 2.x versions.
3More precisely, their Git lists references that have not been marked hidden. Normally there are no hidden refs anyway, though.
4The process is modified a bit for tags, because --tags
and --no-tags
are not the default, and the default is kind of weird and surprising, but is why the tag information that the other Git hands over is augmented in the first place. I won't go into details here though.
5You mostly interact with hash IDs when talking about commit objects. These are just one of four internal object types, but they're the most important here, and branch names, such as refs/heads/master
, are constrained: they must contain only commit hash IDs, not tree or blob or annotated-tag object hash IDs. However, internally, git fetch
has ways of dealing with tree and blob hash IDs as well, to avoid re-sending file content that you already have, for instance. The details are well out of the scope of this answer.
6All of this gets modified, if desired, via what Git calls a shallow clone. In a shallow clone, some specific commits are omitted, which allows omitting all the history that comes before those commits. Shallow clones have some restrictions. The details again depend on exact Git vintage, though most of the strongest restrictions were lifted by Git version 2.0.
Using the standard remote.origin.fetch
, then, this is where your Git creates or updates your origin/master
based on what their Git said about their master
. If you have a standard fetch
setting, your Git will take all of their branches and create-or-update all of your remote-tracking names, using this one-to-one correspondence: their master
becomes your origin/master
; their feature
becomes your origin/feature
.
The mapping is determined by the refspecs, though. So you can create a single-branch clone, and in this single-branch clone you'll have:
remote.origin.fetch=+refs/heads/master:refs/remotes/origin/master
for instance. Now your Git only matches their refs/heads/master
(plus some cases of tags, but see footnote 4). So you only get your origin/master
created-or-updated.
To de-single-branch-ize this clone, you can simply change the default refspec. Or, to fetch two branches, but still just those two, you can add a second remote.origin.fetch
line:
remote.origin.fetch=+refs/heads/dev:refs/remotes/origin/dev
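Here is a sketch of that two-branch setup, using throwaway repositories. The stand-in remote has three branches (master, dev, and a made-up one called other), so you can see that only the names matched by a refspec get created.

```shell
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical "server" repository with three branches.
git init --quiet "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/master
git -C "$work/server" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet --allow-empty -m "initial"
git -C "$work/server" branch dev
git -C "$work/server" branch other

# A single-branch clone gets a narrow fetch refspec.
git clone --quiet --single-branch --branch master "$work/server" "$work/client"
cd "$work/client"
fetch_before=$(git config --get-all remote.origin.fetch)
echo "$fetch_before"

# Add a second, cumulative fetch line for dev, then fetch.
git config --add remote.origin.fetch '+refs/heads/dev:refs/remotes/origin/dev'
git fetch --quiet origin

# Helper (hypothetical): report whether a ref exists in this repository.
has_ref() {
    if git rev-parse --verify --quiet "$1" >/dev/null; then
        echo present
    else
        echo missing
    fi
}

dev=$(has_ref refs/remotes/origin/dev)
other=$(has_ref refs/remotes/origin/other)
echo "origin/dev: $dev, origin/other: $other"
```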
Now, a remote-tracking name has no upstream setting: the upstream setting of a (regular, local) branch is in its branch.<branch>.remote
and branch.<branch>.merge
settings, and there's nothing equivalent for remote-tracking names. It is, however, possible to set up a wildly convoluted set of refspecs. It's not a good idea, though.
Note how we mentioned above the concept of doing a one-to-one mapping from their Git's names (refs/heads/*
) to your remote-tracking names (refs/remotes/origin/*
). If you do this:
remote.origin.fetch=+refs/heads/master:refs/remotes/origin/master
remote.origin.fetch=+refs/heads/master:refs/remotes/origin/master2
you would get two remote-tracking names from one source. Or, with:
remote.origin.fetch=+refs/heads/master:refs/remotes/origin/master
remote.origin.fetch=+refs/heads/dev:refs/remotes/origin/master
you would get one remote-tracking name from two sources.
This is bad, because it means the mapping is not reversible. If we want to go from origin/master
to "the branch name they use over on origin
", is that master
or dev
? Or, if we want to go from master
-on-origin
to our remote-tracking equivalent, is that master
or master2
?
In some ambiguous cases, Git will just give up and do nothing. Moreover, you can use --prune
, or set the option fetch.prune
to true
, and in this case, after handling:
+refs/heads/*:refs/remotes/origin/*
your Git will comb through any refs/remotes/origin/*
names that you have that weren't created-or-updated-or-at-least-refreshed by this git fetch
operation, and remove them. This doesn't work right without a bijection: the algorithm is basically "do an injection, then remove untouched names if the injection was surjective".
Without --prune
, your Git just leaves these "stale" remote-tracking names behind. That's why there's little, but not no, point to removing remote-tracking names. If you don't use -p
or --prune
or set fetch.prune
to true
, you may accumulate these stale branches. Using git branch -r -d
will allow you to delete them. If you delete some by mistake, a subsequent git fetch
will restore them, assuming a normal fetch
setting.
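The difference between a plain fetch and a pruning fetch can be sketched like this, again with throwaway repositories and a made-up has_ref helper. The first fetch turns fetch.prune off explicitly, just in case it is enabled in your global configuration.

```shell
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical "server" repository with a master and a feature branch.
git init --quiet "$work/server"
git -C "$work/server" symbolic-ref HEAD refs/heads/master
git -C "$work/server" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet --allow-empty -m "initial"
git -C "$work/server" branch feature
git clone --quiet "$work/server" "$work/client"

# The branch disappears on the remote after we cloned.
git -C "$work/server" branch -D feature

cd "$work/client"

# Helper (hypothetical): report whether a ref exists in this repository.
has_ref() {
    if git rev-parse --verify --quiet "$1" >/dev/null; then
        echo present
    else
        echo missing
    fi
}

# Without pruning, the stale remote-tracking name is left behind.
git -c fetch.prune=false fetch --quiet origin
stale=$(has_ref refs/remotes/origin/feature)

# With --prune, names that no longer match anything on the remote are removed.
git fetch --quiet --prune origin
pruned=$(has_ref refs/remotes/origin/feature)

echo "after plain fetch: $stale, after --prune: $pruned"
```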
I just do a git config --global fetch.prune true
to set it as the default, though.
Upvotes: 1