ninja

Reputation: 21

Pull all the remote branches into your fork

I forked a repository a few months ago. The original repo now has some new branches that I want in my fork. What should I do? Please help me.

I am giving the example below for a better understanding.

Say there is a repo "upstream/facebook" and I have forked it. A few months later, when I open "upstream/facebook", it shows the branches below:

master
stage_demo
stage_test
stage_help
stage_2020
stage_2021
stage_2019

Branches in my fork:
master
stage_help
stage_2020
stage_2021
stage_2019

So branches stage_demo and stage_test were added which I want in my fork. What should I do? Thanks in advance.

Upvotes: 1

Views: 3013

Answers (1)

torek

Reputation: 489888

There is an asymmetry between git fetch and git push here that is tripping you up. (The GitHub fork aspect of this is not really relevant: you'd get the same thing with any ordinary set of clones.)

What you'll want to do is simple enough:

  1. Run git fetch against the repository that has the various branch names. This results in one remote-tracking name, in your own personal repository, for each branch name in the repository you're fetching from.

    In your case, you're calling this repository upstream, so you would run git fetch upstream. This will create or update the various upstream/* names. You may wish to use git fetch --prune here, in case they've deleted some branch names, so that your own Git will delete the corresponding remote-tracking names: e.g., if they (upstream) used to have a branch named gronk, and they no longer have a branch named gronk, but your Git created upstream/gronk yesterday when they did have their gronk, this git fetch --prune upstream will remove upstream/gronk.

    (You can, if you prefer—as I do—just set fetch.prune in your personal global Git configuration, so that all git fetch operations act as if you had used git fetch --prune. This saves typing it in each time.)

  2. For each remote-tracking name that now appears in your own repository for which you want a branch name to appear in your GitHub fork, use git push to create that branch name in your GitHub fork, pointing to the same commit that is now found by that remote-tracking name in your local repository.

    For instance, you now see upstream/stage_demo in your repository (locally on your computer). This is a remote-tracking name. It was created in step 1 above. You wish to see the branch name stage_demo in your GitHub fork. You just need to ask GitHub to use their Git to set your GitHub fork's stage_demo name—this will be a new branch name for them—based on your own upstream/stage_demo name.

    To be fully explicit, nailing everything down so that nothing can go wrong, you would need:

    git push origin refs/remotes/upstream/stage_demo:refs/heads/stage_demo
    

    (assuming you call your GitHub repository origin). However, you can almost certainly get away with the slightly shorter:

    git push origin upstream/stage_demo:refs/heads/stage_demo
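
Since the question asks about several branches at once, the two steps above can be sketched as a loop. This is only a sandbox illustration, not the one true method: the temp directory, the identity variables, and the repository names (upstream, fork.git, local) are stand-ins invented here, with a bare repository playing the part of the GitHub fork. A push that is not a fast-forward would be rejected, which under set -e stops the loop.

```shell
#!/bin/sh
# Sandbox sketch: every repository here is a throwaway stand-in.
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the original project, with only master at first.
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/master
git -C "$work/upstream" commit -q --allow-empty -m "initial commit"

# Stand-in for the GitHub fork (bare) and for your clone of that fork.
git clone -q --bare "$work/upstream" "$work/fork.git"
git clone -q "$work/fork.git" "$work/local"

# Later, the original project gains two branches.
git -C "$work/upstream" branch stage_demo
git -C "$work/upstream" branch stage_test

cd "$work/local"
git remote add upstream "$work/upstream"

# Step 1: fetch creates or updates the upstream/* remote-tracking names.
git fetch -q --prune upstream

# Step 2: push each upstream/* name to the fork as a branch name.
git for-each-ref --format='%(refname:strip=3)' refs/remotes/upstream |
while read -r name; do
    case "$name" in
        HEAD) continue ;;   # skip a symbolic upstream/HEAD, if one exists
    esac
    git push -q origin "refs/remotes/upstream/$name:refs/heads/$name" </dev/null
done

# List the branch names the fork now has.
git -C "$work/fork.git" for-each-ref --format='%(refname:short)' refs/heads
```

Against a real fork you would keep only the fetch and the loop, run inside your existing clone.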
    

Note that this does not create a local branch named stage_demo. If you wish to do that, you can do that instead of step 2 above with a simple:

git checkout stage_demo

or:

git checkout -t upstream/stage_demo

(or the equivalents with git switch, in Git 2.23 or later). Once you've done that, you can use the simpler:

git push origin stage_demo

to have your Git send the commits to your GitHub fork and request that GitHub create this branch name in your GitHub fork.
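
The local-branch route can be tried the same way in a sandbox. As a sketch under assumptions: the temp directory, the identity variables, and the repository names are stand-ins invented here, and a bare repository plays the part of the GitHub fork. One thing it makes visible is that the new local branch has upstream/stage_demo as its upstream, even though you push it to origin.

```shell
#!/bin/sh
# Sandbox sketch: every repository here is a throwaway stand-in.
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Original project, the bare "fork", and your clone of the fork.
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/master
git -C "$work/upstream" commit -q --allow-empty -m "initial commit"
git clone -q --bare "$work/upstream" "$work/fork.git"
git clone -q "$work/fork.git" "$work/local"
git -C "$work/upstream" branch stage_demo

cd "$work/local"
git remote add upstream "$work/upstream"
git fetch -q upstream

# Create local branch stage_demo from upstream/stage_demo and check it out.
git checkout -q -t upstream/stage_demo

# With a local branch in place, the short push form is enough.
git push -q origin stage_demo

# Show which remote-tracking name the local branch is set to follow.
tracked=$(git rev-parse --abbrev-ref 'stage_demo@{upstream}')
echo "stage_demo follows: $tracked"
```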

More about this asymmetry

What's going on here is simple enough. Both git fetch and git push work by having your Git call up some other Git. Your Git reads from (git push) or writes to (git fetch) your Git repository; the data transfer direction here is determined by which command you ran. The asymmetry is that, having gotten or sent some set of commits (possibly empty), the transfer finishes by:

  • git fetch: creating or updating remote-tracking names in your repository;
  • git push: creating or updating branch names in their repository.

The remote-tracking names are the names that look like origin/master and upstream/stage_demo. They are built out of two pieces: a remote, in this case either origin or upstream, and a branch name, in this case either master or stage_demo. The remote is the one you gave to your git fetch. The branch name is the one found in their Git.

Your git fetch does not write on your branch names. That's because your branch names are yours. They find the commits you care about. Their branch names find the commits they care about. You and they only care about the same commits if and when you decide that this is the case. At other times, you and they care specifically about different commits, and your Git should not forget your commits in favor of theirs just because you picked up their commits. This is especially important when you have made new commits: they won't remember them, so if your Git updated your branch names, your Git would forget your new commits. This would be, well, Very Bad.
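
You can watch the fetch side of this in a sandbox. In this sketch the temp directory, the identity variables, and the two repositories are stand-ins invented here. It prints the default fetch refspec that git remote add configures, which maps their branch names into your remote-tracking names, and checks that your own master is left alone by the fetch.

```shell
#!/bin/sh
# Sandbox sketch: both repositories are throwaway stand-ins.
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "Their" repository, with two branch names.
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/master
git -C "$work/upstream" commit -q --allow-empty -m "their commit"
git -C "$work/upstream" branch stage_demo

# "Your" repository, with your own commit on your own master.
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "my own work"
git remote add upstream "$work/upstream"

# The mapping fetch uses: their refs/heads/* become your refs/remotes/upstream/*.
refspec=$(git config --get remote.upstream.fetch)
echo "$refspec"

before=$(git rev-parse refs/heads/master)
git fetch -q upstream
after=$(git rev-parse refs/heads/master)

# Your branch names, then the remote-tracking names fetch created.
git for-each-ref --format='%(refname)' refs/heads
git for-each-ref --format='%(refname)' refs/remotes/upstream
```

Here master and upstream/master name different commits after the fetch, which is the "your branch names are yours" point in action.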

But git push doesn't create or update any sort of "push-tracking names". (Perhaps it should—or should be able to—given the way we use GitHub, but it doesn't, because they just don't exist.) The push creates or updates their branch names. This implies that they can't create new commits—which in many cases is true because you're pushing to your fork, not to your upstream. Nobody but you can make new commits in your fork, and you normally do that by making them locally, then sending them with git push.

When you are merely using your own GitHub fork as a second / backup storage system for your own commits made in your own repo on your own computer, this all works fine. But here, you're using your own GitHub fork to store your commits and to store their commits from what you're calling your upstream. So now you have to do this triangular workflow:

  • get their commits from your upstream to your computer; then
  • send their commits that are now on your computer to your GitHub repo

and this requires a two-or-more-step dance.

There is a case that lacks this asymmetry

If you don't like all of this, there is a way around it. Git has something called a mirror clone. A mirror clone doesn't use remote-tracking names. When you run git fetch on such a clone, that updates the clone's branch names, so that the clone acts as a mirror of some other clone.
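
A small sandbox shows the difference. As before, this is a sketch: the temp directory, the identity variables, and the repository names are stand-ins invented here. The mirror's fetch refspec maps every ref straight onto the same name, so a new branch in the source shows up as a branch name in the mirror and no remote-tracking names appear.

```shell
#!/bin/sh
# Sandbox sketch: both repositories are throwaway stand-ins.
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# The repository being mirrored.
git init -q "$work/upstream"
git -C "$work/upstream" symbolic-ref HEAD refs/heads/master
git -C "$work/upstream" commit -q --allow-empty -m "initial commit"

# A mirror clone is bare and copies refs one-to-one.
git clone -q --mirror "$work/upstream" "$work/mirror.git"
mirror_refspec=$(git -C "$work/mirror.git" config --get remote.origin.fetch)
echo "$mirror_refspec"

# A branch appears in the source after the mirror was made.
git -C "$work/upstream" branch stage_demo

# Fetching into the mirror updates its own branch names directly.
git -C "$work/mirror.git" fetch -q --prune origin
git -C "$work/mirror.git" for-each-ref --format='%(refname)'
```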

The problem with this is that a mirror clone can't receive new work by any mechanism other than getting it from the place you cloned it from. That's because, having hitched all its own branch names to the other repository, any new commits you stick in it will be lost right away by any git fetch unless those new commits are already in the repository from which you're fetching. So any new commits you make, you must first send directly to origin, before bringing them back into the mirror clone.

Mirror clones, in other words, are not much good here. They're mainly useful to act as local caches. Suppose, for instance, your company has offices in London, New York, San Francisco, and Singapore. Each location has a hundred or more workers who will clone and fetch and merge and so on: a lot of reading. Instead of everyone fetching directly from London every time, you can have mirror clones in NY, SF, and Singapore, update the mirror clones every 15 minutes, and have each office fetch from the local mirror, which is never any more than 15 minutes out of date.

Upvotes: 2
