user17862362

Why should I not use git pull?

I am new to Git, and most of the time I have used git pull origin <my-branch> to get the changes from the remote repository.

However, as I gained some experience, I noticed that git fetch is often preferred. After reading several topics, e.g. What is the difference between 'git pull' and 'git fetch'? and Git: Fetch and merge, don’t pull, I am now confused and need clarification on whether there is a valid reason to prefer it, other than checking the changes before getting them.

The general idea is that git pull is git fetch + git merge, but of course there are several drawbacks, etc.

So, could you please clarify:

1. How should I update my local branch from the remote?

2. As far as I can see, the difference between git pull origin <my-branch> and git pull origin is that the latter gets all the branches from origin besides <my-branch>. Is that true? And which one should I prefer?

Upvotes: 3

Views: 3255

Answers (6)

Randy Leberknight

Reputation: 1453

I use git fetch if I am fetching a new branch that someone else pushed. Let us suppose that I want to look at Mary's latest changes on her branch, featureB.

I would do:

git checkout featureB
git pull origin featureB

That would fetch the latest commits for featureB and merge them into my local copy of featureB.

But if I have never pulled featureB, then I cannot check out featureB.

If I have currently checked out my own branch, featureA, then I do not want to perform git pull origin featureB because that would fetch featureB, and then merge it into my featureA branch.

Therefore I do this:

git fetch origin featureB
git checkout featureB

Now I have branch featureB in my local repo.
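The fetch-then-checkout sequence can be tried in a throwaway sandbox. This is only a minimal sketch, assuming a reasonably recent Git; the repository names (seed, remote.git, me, mary), the helper mk, and the file names are made up for illustration and are not part of the answer.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
# Isolate the demo from any personal Git configuration.
export HOME="$work" XDG_CONFIG_HOME="$work" GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# mk <repo> <file> <message>: hypothetical helper that commits one small file.
mk() { echo "$3" > "$1/$2"; git -C "$1" add "$2"; git -C "$1" commit -q -m "$3"; }

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
mk seed base.txt "base"
git clone -q --bare seed remote.git   # stands in for the shared remote
git clone -q remote.git me
git clone -q remote.git mary

# Mary publishes a branch that my clone has never seen.
git -C mary checkout -q -b featureB
mk mary b.txt "Mary's work on featureB"
git -C mary push -q origin featureB

cd me
git fetch -q origin featureB          # brings her commits in as origin/featureB
git checkout -q featureB              # creates a local featureB that tracks it
upstream=$(git rev-parse --abbrev-ref 'featureB@{upstream}')
echo "featureB tracks: $upstream"
```

The checkout works here because the fetch created origin/featureB, which Git then uses as the starting point for the new local branch.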

Since I now have branch featureB in my local repo, I can simply check it out the next time I want to update it from the remote repo.

Suppose I had branch featureA checked out, and I want to get the latest version of featureB. I can do this:

git checkout featureB
git pull origin featureB

Upvotes: 0

VonC

Reputation: 1323263

How should I update my local branch from the remote?

I always use git pull (which by default fetches all branches).

But first I set (possible since Git 2.6):

git config --global pull.rebase true
git config --global rebase.autoStash true

That way, after fetching, a simple git pull triggers a rebase of my local commits on top of the fetched remote branch.
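The effect of these two settings can be checked in a scratch setup. The sketch below is an illustration only: it sets the options per repository rather than with --global so that it does not touch your real configuration, and the repository names, the mk helper, and the file names are invented for the demo.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
# Isolate the demo from any personal Git configuration.
export HOME="$work" XDG_CONFIG_HOME="$work" GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# mk <repo> <file> <message>: hypothetical helper that commits one small file.
mk() { echo "$3" > "$1/$2"; git -C "$1" add "$2"; git -C "$1" commit -q -m "$3"; }

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
mk seed base.txt "base"
git clone -q --bare seed remote.git
git clone -q remote.git me
git clone -q remote.git mary

# Someone else pushes to main while I commit locally: the histories diverge.
mk mary theirs.txt "their work"
git -C mary push -q origin main
mk me mine.txt "my local work"

cd me
git config pull.rebase true        # local here; the answer uses --global
git config rebase.autoStash true
echo "uncommitted edit" >> base.txt   # dirty working tree, handled by autoStash

git pull -q >/dev/null 2>&1
merges=$(git rev-list --merges --count HEAD)   # merge commits in my history
top=$(git log -1 --format=%s)                  # subject of the newest commit
echo "merge commits: $merges, newest commit: $top"
```

With these settings the local commit ends up replayed on top of the fetched one, no merge commit is created, and the uncommitted edit is stashed and restored around the rebase.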

The only disadvantage I see is that you might have to resolve the same conflicts multiple times, git pull after git pull. But that is what git rerere would prevent if you are in that situation (which is quite rare).
See "What is git-rerere, and how does it work?"

On the advantages: see "Is it better to use git pull --rebase than git pull --ff-only".

Upvotes: 0

torek

Reputation: 487755

git pull:

  1. runs git fetch; then
  2. (without waiting for you to confirm!) runs a second Git command.

If you want to run two Git commands, and the second command that git pull will run is the command that you want to run second, git pull is fine.

I personally often like to insert some additional Git command(s) between the fetch command and the other command. This is impossible to do when using git pull, because it does not pause for you. So I often avoid git pull. (In particular, I often want to run git log to see what I'm getting into.)
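As a rough illustration of that workflow, the sketch below fetches, inspects the incoming commits with git log, and only then runs the second command by hand. It assumes a sandbox built from scratch; the repository names, the mk helper, and the file names are invented for the demo.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
# Isolate the demo from any personal Git configuration.
export HOME="$work" XDG_CONFIG_HOME="$work" GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# mk <repo> <file> <message>: hypothetical helper that commits one small file.
mk() { echo "$3" > "$1/$2"; git -C "$1" add "$2"; git -C "$1" commit -q -m "$3"; }

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
mk seed base.txt "base"
git clone -q --bare seed remote.git
git clone -q remote.git me
git clone -q remote.git mary
mk mary theirs.txt "their new work"
git -C mary push -q origin main

cd me
git fetch -q origin                                  # step 1: only fetch
incoming=$(git log --oneline 'HEAD..@{upstream}')    # look before you leap
echo "incoming commits:"
echo "$incoming"

# Step 2, chosen after looking: here a fast-forward-only merge is enough.
git merge -q --ff-only '@{upstream}'
remaining=$(git log --oneline 'HEAD..@{upstream}')
```

The range HEAD..@{upstream} lists commits that the upstream branch has and the current branch does not, which is exactly what the second command is about to bring in.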

I also find that for those new to Git, they think git pull is sort of magical. By avoiding it—at least initially—in favor of the two separate steps, they learn how to use Git. When using git pull, they don't learn how to use Git. (This applied to me back in 2005 or so too.) So I encourage newbies to use the separate commands. This helps not only with the "Git isn't magic" part, but also with the fact that the second command that git pull runs is something you choose:

  • You can choose to have git pull run git merge.
  • You can choose to have git pull run git rebase.

These two commands both combine work, but the way they do it is very different. If you have done no work in your repository, combining the work you did (i.e., nothing) with the work someone else did results in getting the work that someone else did, so it doesn't matter which command you use. But if you did do some work in your repository, it does matter.

This, too, stands out much better when you separate out the two commands:

  • git merge means merge my work with their work: merge "nothing" with "something" = "something"; merge "something" with "something else" = "some third thing".
  • git rebase means redo my work atop their work: redo "nothing" atop "something" = "something", but redo something atop something else, well, you can probably see where this is going (but if not, read up on git rebase).
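To make the difference concrete, the following sketch puts two clones in the same diverged situation and resolves one with git merge and the other with git rebase. It is an illustration under assumptions: the clone names merger and rebaser, the mk helper, and the file names are all invented.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
# Isolate the demo from any personal Git configuration.
export HOME="$work" XDG_CONFIG_HOME="$work" GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# mk <repo> <file> <message>: hypothetical helper that commits one small file.
mk() { echo "$3" > "$1/$2"; git -C "$1" add "$2"; git -C "$1" commit -q -m "$3"; }

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
mk seed base.txt "base"
git clone -q --bare seed remote.git
git clone -q remote.git merger
git clone -q remote.git rebaser
git clone -q remote.git mary

# Their work lands on the remote; both of my clones also have local work.
mk mary theirs.txt "their work"
git -C mary push -q origin main
mk merger mine.txt "my work"
mk rebaser mine.txt "my work"

# Same fetch, different second command.
git -C merger fetch -q origin
git -C merger merge -q --no-edit origin/main
git -C rebaser fetch -q origin
git -C rebaser rebase -q origin/main

merge_merges=$(git -C merger rev-list --merges --count HEAD)
rebase_merges=$(git -C rebaser rev-list --merges --count HEAD)
merge_total=$(git -C merger rev-list --count HEAD)
rebase_total=$(git -C rebaser rev-list --count HEAD)
echo "merge:  $merge_total commits, $merge_merges merge commit(s)"
echo "rebase: $rebase_total commits, $rebase_merges merge commit(s)"
```

Both clones end up with the same files, but the merged history gains an extra two-parent commit while the rebased history stays linear with a rewritten copy of the local commit.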

To answer your specific questions:

  1. How should I update my local branch from the remote?

That depends on which result you want and how sure you are about that second command.

  2. As far as I can see, the difference between git pull origin <my-branch> and git pull origin is that the latter gets all the branches from origin besides <my-branch>. Is that true?

Mostly. This is where we really need to break git pull down into its two steps, and observe what it gives to each of the two steps.

When you run git pull, you can provide options. For instance, these are both valid ways to invoke git pull:

git pull --rebase

git pull --ff-only

These options are contradictory because --rebase says that git pull should run git rebase as its second command, while --ff-only says that git pull should supply the --ff-only option to the git merge second command, implying that it should run git merge, not git rebase.
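A small sandbox shows how differently the two options behave once local and remote history have diverged. This is a sketch under assumptions (invented repository names, helper mk, and file names): --ff-only is expected to refuse, while --rebase replays the local commit.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
# Isolate the demo from any personal Git configuration.
export HOME="$work" XDG_CONFIG_HOME="$work" GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# mk <repo> <file> <message>: hypothetical helper that commits one small file.
mk() { echo "$3" > "$1/$2"; git -C "$1" add "$2"; git -C "$1" commit -q -m "$3"; }

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
mk seed base.txt "base"
git clone -q --bare seed remote.git
git clone -q remote.git me
git clone -q remote.git mary

# Diverge: one new commit on the remote, one new commit locally.
mk mary theirs.txt "their work"
git -C mary push -q origin main
mk me mine.txt "my work"

cd me
head_before=$(git rev-parse HEAD)
if git pull --ff-only -q >/dev/null 2>&1; then
  ff_result=fast-forwarded
else
  ff_result=refused            # no fast-forward is possible after diverging
fi
head_after_ff=$(git rev-parse HEAD)

git pull --rebase -q >/dev/null 2>&1   # second command is now git rebase
merges=$(git rev-list --merges --count HEAD)
echo "--ff-only: $ff_result; merge commits after --rebase: $merges"
```

The fetch step still runs in the refused case, so origin/main is updated even though the local branch is left where it was.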

So some options control which second command pull should use. Other options are passed to the second command. Still other options are passed to the first, git fetch, command. It's all a bit confusing, and is yet another reason to learn git fetch first.

You can also provide arguments, such as the <my-branch> you suggested here. All non-option arguments you provide are passed to git fetch. Arguments are distinguished from options by the - or -- that goes in front of an option. (Single-dash - options are single letters, such as -j or -4; double-dash -- options are multiple letters, such as --rebase and --show-forced-updates.)

If you provide arguments like origin and <my-branch>, these go through to git fetch, and this affects how git fetch operates. With no arguments, git fetch will:

  • Find the right remote to call up (generally origin): a "remote" is a short name for a way to reach some other Git software that will, in this case, read from another Git repository. In this case you're reaching out to some Git software on GitHub or Bitbucket or GitLab, perhaps, where there's the Git repository from which you made your Git repository earlier. You'd like to reach out to that same Git repository now, and find out if they have any new commits that your Git repository does not yet have. (How did those commits get there? Well, we can worry about that later.)

  • Call up that software and connect to that repository. That repository has its branches and its commits. The branches in that repository are not your branches! They are their branches. They may store different commit hash IDs in them.

  • Figure out which commits they have that you don't, based on the hash IDs stored in their branch names. Decide which commits you want in your repository.

If you don't list some branch name(s) on the git fetch command, your Git assumes you want to update all your copies of all of their branches. So your Git will inspect their master or main, and their develop, and their feature/short or feature/long or feature/tall or whatever. Your Git will figure out if they have any new commits that you don't, and will bring those commits over into your Git repository.

Because commits are numbered with universally unique identifiers, your Git (your software operating on your repository) will now have all of their commits, using the same numbers they are using. Your Git will also have all your own commits that they don't have at all. Now that your Git has all their commits, your Git will create or update all your remote-tracking names: origin/main or origin/master for their main or master, origin/develop for their develop, and so on. Your Git builds these names by sticking the remote name, origin, in front of each of their branch names.

These remote-tracking names constitute your Git's memory of where their branches were, the last time you got hold of their Git. So git fetch with no arguments updates all of them, and since git pull with no arguments calls git fetch with no arguments, you get all your origin/* names updated. With one argument—git pull origin—the same thing happens, you're just now being explicit that you want to work with the remote named origin. If that's the only remote that you have—and that's a typical setup—this does exactly the same thing; any other name here, like git fetch belgium or something, just gives you an error.

But if you run git fetch origin develop, that tells your Git that, for the purpose of this one git fetch operation, you'd like your Git to call up their Git, see all their branches, but limit your updates to any commits needed to update your origin/develop. If they have a new commit on their main, you won't update your origin/main after all. (You'll almost certainly want or have to do that later, so this doesn't really save you much. In fact it might take more time later, vs doing it all at once, due to the way Git optimizes fetching. But it's there if you want it.)

Since git pull passes all the arguments on, git pull origin develop directs your git fetch step to limit itself to their branch named develop. (Again, this becomes your origin/develop.)
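The limiting effect of naming a branch can be observed directly by comparing remote-tracking names before and after a restricted fetch. The sketch below is illustrative only; the repository names, the mk helper, and the file names are assumptions.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
# Isolate the demo from any personal Git configuration.
export HOME="$work" XDG_CONFIG_HOME="$work" GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# mk <repo> <file> <message>: hypothetical helper that commits one small file.
mk() { echo "$3" > "$1/$2"; git -C "$1" add "$2"; git -C "$1" commit -q -m "$3"; }

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
mk seed base.txt "base"
git -C seed branch develop
git clone -q --bare seed remote.git
git clone -q remote.git me
git clone -q remote.git mary

# New commits appear on BOTH of their branches.
mk mary main-news.txt "news on main"
git -C mary push -q origin main
git -C mary checkout -q develop
mk mary dev-news.txt "news on develop"
git -C mary push -q origin develop

cd me
old_main=$(git rev-parse origin/main)
git fetch -q origin develop                 # limited to their develop
main_after_limited=$(git rev-parse origin/main)
develop_after_limited=$(git rev-parse origin/develop)
their_develop=$(git -C ../remote.git rev-parse refs/heads/develop)

git fetch -q origin                         # no branch named: update everything
main_after_full=$(git rev-parse origin/main)
their_main=$(git -C ../remote.git rev-parse refs/heads/main)
echo "origin/main moved only after the full fetch"
```

After the restricted fetch, origin/develop matches their develop while origin/main is still the old commit; the unrestricted fetch catches it up.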

But now the second command comes into play. Having run git fetch, with whatever extra options and arguments it might have used, your git pull now runs the second command you chose. (You did choose one, right? 😀 Always make sure you know which second command Git will run here! Most people set one up semi-permanently, so that they know.) This second command is either:

git rebase [options argument(s)]

or:

git merge options argument(s)

Git's pull often passes some options and/or arguments here. In particular, for git merge, it passes:

-m "merge branch '<branch>' of <url>"

to set the merge message, and then it passes the raw hash ID of the tip commit you brought in. For rebase, it may pass --autostash, and it may pass a commit hash ID (or it can let rebase figure out @{upstream} on its own). You don't really need to know all this, but it's worth remembering that git pull does some extra stuff especially for the git merge case, to set the merge message.

There's one last caveat here:

git pull origin br1 br2

is tempting to newbies. Do not use it. It runs git fetch origin br1 br2 and then runs an octopus merge (of HEAD, origin/br1, and origin/br2, in effect) and unless you really know what you're doing, you don't want an octopus merge.
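To see what that command produces, the sketch below runs it in a disposable sandbox and counts the parents of the resulting commit. It is meant as a demonstration of the pitfall, not a recommendation; the repository names, the mk helper, and the file names are invented, and --no-rebase and --no-edit are passed so that the merge path runs without opening an editor.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
# Isolate the demo from any personal Git configuration.
export HOME="$work" XDG_CONFIG_HOME="$work" GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# mk <repo> <file> <message>: hypothetical helper that commits one small file.
mk() { echo "$3" > "$1/$2"; git -C "$1" add "$2"; git -C "$1" commit -q -m "$3"; }

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
mk seed base.txt "base"
git clone -q --bare seed remote.git
git clone -q remote.git me
git clone -q remote.git mary

# Two independent branches on the remote, plus one local commit of my own.
git -C mary checkout -q -b br1 main
mk mary one.txt "work on br1"
git -C mary checkout -q -b br2 main
mk mary two.txt "work on br2"
git -C mary push -q origin br1 br2
mk me mine.txt "my work on main"

cd me
git pull --no-rebase --no-edit -q origin br1 br2 >/dev/null 2>&1

# Count the parents of the commit that pull just created.
set -- $(git log -1 --format=%P)
parents=$#
echo "parents of the new commit: $parents"
```

A single commit with more than two parents is an octopus merge: both remote branches have been folded into the current branch in one step.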

This results in several bottom lines:

If you set git pull to always run git rebase, there's very little difference between running your own git fetch followed by your own git rebase, and just running git pull. That's because there's no merge message to alter. Be sure you know what rebase does before you do this, though: rebase is more complicated than merging.

If you set git pull to always run git merge, the fetch-then-merge that pull does has the advantage (?) of setting the merge message to something that might be slightly better than the default you'd get with two separate commands. Compare:

merge branch 'smörgåsbord' of ssh://github.com/swedish/meatballs.git

vs:

merge branch 'origin/smörgåsbord'

Neither one really tells you anything useful, but some might like one better than the other.

Watch out for git pull <remote> <branch1> <branch2>, which is almost certain to do the wrong thing (though if you're set up to rebase, this should just give you an error; rebasing does not make sense with this case).

If you want to run a command (such as git log) between the two commands in order to choose which second command to use, you cannot use the do-it-all-at-once leap-before-you-look git pull. This is why, and when, I avoid git pull.

Other than that, they're pretty much the same thing, once you know how git pull just runs two Git commands for you.

Upvotes: 4

Tolis Gerodimos

Reputation: 4400

git fetch is the command that tells your local Git to retrieve the latest commits and branch information from the remote, without changing any of your local branches or working files. It's more like checking to see which changes are available.

git pull, on the other hand, does that AND brings (merges) those changes from the remote repository into your current branch.

The takeaway is to keep in mind that there generally are at least three copies of a project on your workstation.

  1. One copy is your own repository with your own commit history (the already saved one, so to say).
  2. The second copy is your working copy where you are editing and building (not committed yet to your repo).
  3. The third copy is your local “cached” copy of a remote repository (probably the original from where you cloned yours).

You can use git fetch to know the changes done in the remote repo/branch since your last pull. This is useful to allow for checking before doing an actual pull, which could change files in your current branch and working copy (and potentially lose your changes, etc).

git fetch
git diff ...origin

  1. How should I update my local branch from the remote?

git pull is a safe way to update your branch. Occasionally a conflict will arise from pulling, but these are the edge cases. In the worst case, you would undo the changes made by git pull.

  2. As far as I can see, the difference between git pull origin "my-branch" and git pull origin is that the latter gets all the branches from origin besides "my-branch". Is that true? And which one should I prefer?

If you use git pull origin without specifying "my-branch", Git will fill in the value with the branch that you are currently using.

Upvotes: 5

sommmen

Reputation: 7608

I am new to Git, and most of the time I have used git pull origin <my-branch> to get the changes from the remote repository.

They are different commands that do different things. fetch downloads all new remote commits. You can then review them, and when you actually want to apply those new remote commits to your local branch, you run a pull.

To be clear, running a pull is the equivalent of running a fetch and then a merge, which merges the remote changes into your local branch.

I am now confused and need clarification on whether there is a valid reason to prefer it, other than checking the changes before getting them.

You don't use one or the other; you use them both. There is no preference: you fetch and then you pull, or you just pull.

The preference you're seeing likely comes from the fact that fetch is a safe operation - it changes nothing. Pull does change your local files and you could end up with merge conflicts.
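The claim that fetch leaves your own branch and files alone can be checked by recording HEAD before and after. This is a minimal sketch in a scratch setup; the repository names, the mk helper, and the file names are assumptions made for the demo.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
# Isolate the demo from any personal Git configuration.
export HOME="$work" XDG_CONFIG_HOME="$work" GIT_CONFIG_NOSYSTEM=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# mk <repo> <file> <message>: hypothetical helper that commits one small file.
mk() { echo "$3" > "$1/$2"; git -C "$1" add "$2"; git -C "$1" commit -q -m "$3"; }

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
mk seed base.txt "base"
git clone -q --bare seed remote.git
git clone -q remote.git me
git clone -q remote.git mary
mk mary theirs.txt "their work"
git -C mary push -q origin main

cd me
head_before=$(git rev-parse HEAD)
git fetch -q origin
head_after=$(git rev-parse HEAD)
dirty=$(git status --porcelain)                      # empty if nothing changed
behind=$(git rev-list --count 'HEAD..@{upstream}')   # commits waiting in origin/main
echo "HEAD unchanged: $head_before, commits waiting: $behind"
```

The new commit is present in the local repository after the fetch, but only under origin/main; the checked-out branch and the working files stay as they were until a merge, rebase, or pull is run.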

  1. How should I update my local branch from the remote?

Running a pull on a local feature branch is fine, but there may be merge commits.

  2. As far as I can see, the difference between git pull origin <my-branch> and git pull origin is that the latter gets all the branches from origin besides <my-branch>. Is that true? And which one should I prefer?

Not sure about that one.

Upvotes: 1

Mayank Jain

Reputation: 466

I am not writing this based on the documentation but on some experience. It's not much, but it might help.

  1. git pull is the command you should use to get the code from the remote into your local branch.
  2. git fetch only downloads the new commits and updates the remote-tracking branches; it does not change your local branch or working files.

In general, with a single repo you will rarely see the fetch command being used; fetch only becomes useful, and is used frequently, if you are working on a fork or have multiple remote repos.

Upvotes: 0
