Will pulling my repo get rid of the work I have done?

Question

In case the title is not clear, here is the scenario.

Today when I started working, I forgot to git pull.

Since then I have written around 200-300 lines, and when I went to git push, it said my local repo was out of date, so I need to git pull.

But will git pull overwrite the code I have already written, since the changes are in the same file?

torek · Accepted Answer

No—but don't use git pull anyway. Not ever, or at least, not yet: not until you are familiar with the two commands that git pull runs for you.

These two commands are git fetch, which is always safe, and another command. But why do I say "another command" instead of saying which command? That's because the second command git pull runs for you depends on something.

What it should depend on is what you got with git fetch, the first step. Only, it doesn't. It depends on what you tell git pull to run before you know what you got with git fetch.

So: you should run git fetch first, then a second command. The second command is probably git rebase, but let's take it a bit slower.

Git is all about commits

You did a bunch of work today, and you did it by:

editing files in your work-tree (where files have their normal form, so that you can work with them, hence the name work-tree);
using git add: this copies the files to Git's index (also called the "staging area"), replacing whatever was in the index for that file before; and
running git commit: this turns whatever is in the index into a commit.
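The three steps above can be sketched in a throwaway repository. This is only an illustration: the sandbox directory, the file name notes.txt, and the identity values are made up for the example. It shows that git commit records what is in the index, not what is in the work-tree.

```shell
set -eu
sandbox=$(mktemp -d)                  # throwaway directory, hypothetical layout
cd "$sandbox"
export GIT_AUTHOR_NAME=You GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=You GIT_COMMITTER_EMAIL=you@example.com

git init -q demo
cd demo

echo "first draft" > notes.txt        # 1. edit a file in the work-tree
git add notes.txt                     # 2. copy that version into the index
echo "second draft" > notes.txt       #    edit again, but do NOT git add
git commit -q -m "add notes"          # 3. commit whatever is in the index

committed=$(git show HEAD:notes.txt)  # the version stored in the commit
worktree=$(cat notes.txt)             # the version in the work-tree
echo "committed: $committed"
echo "work-tree: $worktree"
```

The second edit stays in the work-tree, uncommitted, until it is added and committed in turn.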

These commits are in your repository. Your repository is yours: it's private to you. Whatever you do is yours, and, in the words of the old Outer Limits TV show, you control the horizontal, you control the vertical. Each commit is identified by a big ugly hash ID: 8a3fc17... or whatever.

But there's another repository out there, the one you call origin, at some URL. (Insert ominous music :-) )

Making new commits

Each of the commits in your own repository has a bit of history stored in it (the history is the commits, in other words). Each commit says "the commit that came before me was <hash ID>". This makes chains of commits:

A <- B <- C    <-- master

A branch name like master simply identifies your latest commit, in this case C. That commit C "points back" to the earlier commit B, which points back to A. (In this drawing, the entire repository has just three commits. A was the first one, so it can't "point back" to an earlier commit, so it just doesn't.)

When you add a new commit D, Git makes the new commit point back to whatever was the tip before, and makes the new commit the new branch tip:

A <- B <- C <- D   <-- master

If you made several commits, they all chain together like this:

A--B--C--D--E--F--G   <-- master

(the internal backwards arrows are too annoying and space-consuming to bother with at this point; just keep in mind that in Git, everything is backwards like this).

Every commit, once saved like this, is permanent—well, mostly permanent—and forever unchanging. (It literally can't be changed, because the big ugly hash ID for it is computed by doing a cryptographic checksum over its contents. If you change anything, even a single bit, the new contents get a new, different checksum, so they are a new and different commit.) But it's only in your repository. Eventually, you need to publish—or push—your new commits, so that others can see them. (Or, you can distribute them in other ways, but we'll just worry about push here.)
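Here is a small sketch of that "can't be changed" claim, run in a scratch repository (the directory, file name, and commit messages are invented for the example). Even git commit --amend, which looks like it edits a commit, really makes a new commit with a new hash ID and leaves the old one in place.

```shell
set -eu
sandbox=$(mktemp -d)                  # throwaway directory, hypothetical layout
cd "$sandbox"
export GIT_AUTHOR_NAME=You GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=You GIT_COMMITTER_EMAIL=you@example.com

git init -q demo
cd demo
echo hello > file.txt
git add file.txt
git commit -q -m "original message"
old=$(git rev-parse HEAD)             # hash ID of the commit as first made

# "Changing" the commit builds a different commit with a different hash ID.
git commit -q --amend -m "reworded message"
new=$(git rev-parse HEAD)

old_type=$(git cat-file -t "$old")            # the old object still exists
old_subject=$(git log -1 --format=%s "$old")  # and still has its old message
echo "old: $old ($old_subject)"
echo "new: $new"
```

The branch name now finds the new commit; the old one is merely no longer pointed to by that name.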

Note that it's the branch names that find these tip commits. Git otherwise doesn't know where to start. Once Git has a tip commit, it uses the backwards pointers to find the other commits.
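To see the backwards pointers for yourself, the following sketch builds the three-commit chain A, B, C in a scratch repository and asks Git for each commit's parent. The directory and file names are placeholders chosen for the example.

```shell
set -eu
sandbox=$(mktemp -d)                  # throwaway directory, hypothetical layout
cd "$sandbox"
export GIT_AUTHOR_NAME=You GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=You GIT_COMMITTER_EMAIL=you@example.com

git init -q demo
cd demo
for name in A B C; do                 # three commits, each one after the last
    echo "$name" > file.txt
    git add file.txt
    git commit -q -m "$name"
done

# %s = subject line, %h = short hash ID, %p = short parent hash ID(s)
git log --format='%s  %h  parent: %p'

tip=$(git log -1 --format=%s HEAD)            # what the branch name finds
parent_of_tip=$(git log -1 --format=%s HEAD~1)
root_parents=$(git log -1 --format=%p HEAD~2) # A has no parent at all
```

The log output starts at the tip and walks backwards, which is exactly the order in which Git itself discovers the commits.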

`git fetch` gets you new published commits

So, let's say you had:

...--E--F--G   <-- master

in your repository, being up to date with the latest published commits on origin. But then someone published some new commit H in the repository over on origin. You would want to run git fetch.

This has your Git call up their Git on the Internet-phone. They talk a bit and find that their Git has new commit H. Your Git loads it into your repository. New commit H "points back" to existing commit G, and is on their master.

To keep from disturbing your master, your Git saves H, but uses the name origin/master:

...--E--F--G     <-- master
            \
             H   <-- origin/master

This is what git fetch does: it calls up some other Git and finds out what's new, downloads all the new things, and uses these so-called remote-tracking branch names (origin/master) to remember what it just got.
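The claim that git fetch leaves your own master alone can be checked in a sandbox. The sketch below assumes a local bare repository standing in for origin and a second clone (named other here) standing in for whoever published H; all of these names are invented for the example.

```shell
set -eu
sandbox=$(mktemp -d)                  # throwaway directory, hypothetical layout
cd "$sandbox"
export GIT_AUTHOR_NAME=You GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=You GIT_COMMITTER_EMAIL=you@example.com

# Build a one-commit history (G) and publish it as a bare "origin".
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master   # use the branch name from the text
echo G > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m G
git clone -q --bare seed origin.git
git clone -q origin.git you
git clone -q origin.git other

# Someone else publishes new commit H.
echo H > other/file.txt
git -C other commit -q -am H
git -C other push -q origin master

before=$(git -C you rev-parse master)
git -C you fetch -q origin                        # downloads H, moves origin/master
after=$(git -C you rev-parse master)

local_tip=$(git -C you log -1 --format=%s master)
remote_tip=$(git -C you log -1 --format=%s origin/master)
echo "master: $local_tip   origin/master: $remote_tip"
```

After the fetch, master still names G while origin/master names H, matching the drawing above.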

`git push` is like `git fetch`, in the other direction—but not exactly

To publish a commit, you will use git push. This is like fetch, but when your Git calls up the other Git on origin, you give them your commits. Then you ask them to set their branches—not a special will/master, for instance, but just plain master—to point to your new branch tip.

If you're the only one working, that's fine. But maybe another guy, someone named Bob, is also doing work. (Insert ominous music again.)

`git pull` runs a second command

Now, if you haven't made any new commits of your own, you probably want your master to incorporate new commit H. There are two standard commands to do this, and both of them do the same thing at this point, because you haven't made any new commits of your own yet.

These two commands are git merge and git rebase. The git pull command defaults to running git merge as its second command—but in fact, most people should run git rebase.

Right now, it won't make any difference. Your Git will see:

...--E--F--G     <-- master
            \
             H   <-- origin/master

in your repository, and will say: "ah, all I need to do is slide the name master down-and-forward":

...--E--F--G
            \
             H   <-- master, origin/master

(Git calls this a fast-forward.) Now we can also straighten out the kink and just have a linear chain going to H.
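As a rough check that merge and rebase really do the same thing when you have no commits of your own, this sketch sets up two clones of a sandbox origin, has a third clone publish H, and then updates one clone with git merge and the other with git rebase. The repository names via_merge, via_rebase, and other are made up for the example.

```shell
set -eu
sandbox=$(mktemp -d)                  # throwaway directory, hypothetical layout
cd "$sandbox"
export GIT_AUTHOR_NAME=You GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=You GIT_COMMITTER_EMAIL=you@example.com

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo G > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m G
git clone -q --bare seed origin.git
git clone -q origin.git via_merge
git clone -q origin.git via_rebase
git clone -q origin.git other

echo H > other/file.txt               # someone else publishes H
git -C other commit -q -am H
git -C other push -q origin master

git -C via_merge fetch -q origin
git -C via_rebase fetch -q origin

# --ff-only makes merge refuse to do anything except slide the name forward.
git -C via_merge merge -q --ff-only origin/master
git -C via_rebase rebase -q origin/master

published=$(git --git-dir=origin.git rev-parse master)
merge_tip=$(git -C via_merge rev-parse master)
rebase_tip=$(git -C via_rebase rev-parse master)
echo "origin: $published"
echo "merge:  $merge_tip"
echo "rebase: $rebase_tip"
```

All three hash IDs come out identical: no new commit was made, only the name master moved.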

What happens when you and someone else both make commits?

Let's get H safely saved away now:

...--F--G--H   <-- master, origin/master

Now let's suppose you forget to update your own repository—or even don't forget, but Bob manages to make a commit and push it while you're working. You, in your repository, make new commit J (we're skipping I to reserve it for Bob):

             J   <-- master
            /
...--F--G--H     <-- origin/master

Note that origin/master in your repository hasn't moved.

Bob, meanwhile, has his own repository, that he also copied (cloned) from origin. He also has this F-G-H sequence. Bob makes a new commit I and runs git push origin. Bob's Git hands Bob's commit I over to origin and asks that they add it to their master, in their repository, over on origin:

...--F--G--H--I   <-- master    (on origin)

Now you run git push. Your Git calls up origin's Git and says "take this shiny new commit J"—it does—and then asks origin's Git to make J the tip of origin's master.

This time, they—origin—say no.

Look what happens to them if they say yes:

...--F--G--H--J   <-- master
            \
             I    [lost! nobody points to I anymore]

So that's why they say "no".
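The rejection can be reproduced in a sandbox. In this sketch a local bare repository plays origin, and the clones you and bob, plus the file names, are placeholders for the example. Bob pushes I first, and then your push of J is refused.

```shell
set -eu
sandbox=$(mktemp -d)                  # throwaway directory, hypothetical layout
cd "$sandbox"
export GIT_AUTHOR_NAME=You GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=You GIT_COMMITTER_EMAIL=you@example.com

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo H > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m H
git clone -q --bare seed origin.git
git clone -q origin.git you
git clone -q origin.git bob

echo I > bob/bobs.txt                 # Bob commits I and pushes first
git -C bob add bobs.txt
git -C bob commit -q -m I
git -C bob push -q origin master

echo J > you/yours.txt                # you commit J without fetching
git -C you add yours.txt
git -C you commit -q -m J

# Your push asks origin to make J the tip, which would drop I.
if git -C you push -q origin master 2>/dev/null; then
    push_result=accepted
else
    push_result=rejected
fi

origin_tip=$(git --git-dir=origin.git log -1 --format=%s master)
echo "push: $push_result   origin's master: $origin_tip"
```

Nothing is lost on either side: origin still ends at I, and your J is still safely in your own repository.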

Merge and rebase: here's where the second command matters

Once you have run git push and it has rejected your request, now you have to take action.

You have your commit J, but origin has a commit you don't—Bob's I. You must git fetch it, which gets you commit I:

             J   <-- master
            /
...--F--G--H
            \
             I   <-- origin/master

This is now only in your repository (origin and Bob don't have your J yet—well, origin might, since you gave it to them, but they don't remember it anymore). It's now your job to knit these together.

You can do this with git merge, or with git rebase. And git pull will run one of these. By default, it runs git merge.

The merge command makes yet another new commit, called a "merge commit". A merge commit is a bit special: it points back to both commits:

             J
            / \
...--F--G--H   K   <-- master
            \ /
             I     <-- origin/master

The merge commit combines your work (in J) with Bob's (in I). Now you can git push again: your Git will send, to origin, both J and K, and ask origin's Git to set their master to point to K. Since K points back to I (and also to J), they should now accept the push.
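Here is the merge route as a sandbox sketch, under the assumption that your change and Bob's touch different files so there is nothing to resolve by hand. The repository and file names are invented, and the merge commit is given the message K only to match the drawing.

```shell
set -eu
sandbox=$(mktemp -d)                  # throwaway directory, hypothetical layout
cd "$sandbox"
export GIT_AUTHOR_NAME=You GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=You GIT_COMMITTER_EMAIL=you@example.com

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo H > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m H
git clone -q --bare seed origin.git
git clone -q origin.git you
git clone -q origin.git bob

echo I > bob/bobs.txt                 # Bob publishes I
git -C bob add bobs.txt
git -C bob commit -q -m I
git -C bob push -q origin master

echo J > you/yours.txt                # you commit J locally
git -C you add yours.txt
git -C you commit -q -m J

git -C you fetch -q origin            # get Bob's I as origin/master
git -C you merge -q -m K origin/master

# A merge commit has two parents: ^1 is your side, ^2 is the merged-in side.
first_parent=$(git -C you log -1 --format=%s master^1)
second_parent=$(git -C you log -1 --format=%s master^2)

git -C you push -q origin master      # accepted: K leads back to I
origin_tip=$(git --git-dir=origin.git log -1 --format=%s master)
git -C you log --oneline --graph master
```

Both yours.txt and bobs.txt are present in the work-tree after the merge, so neither person's work was overwritten.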

The only problem is that you have added this kind-of-useless merge K. Right now, K records the fact that you and Bob worked at the same time, but Bob beat you to the git push step, and you had to compensate for that. Tomorrow, that probably won't matter. A year from now, it almost certainly won't matter.

The alternative to merging is to rebase. A rebase copies commits to make them land at a new position. Suppose we could copy your original J to a new J' that does what J did—makes the same changes—but makes them after Bob's work:

             J   [no longer needed]
            /
...--F--G--H   J'   <-- master
            \ /
             I   <-- origin/master

This is just what git rebase does. If you had several commits, it would copy all of them, in order, placing each copy after the last commit on origin/master.

Once your as-yet-unpublished commits are all copied like this, you can git push again. This time your Git sends J' to origin's Git. Since J' comes after I, this time, they will take it.
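The rebase route looks like this in a sandbox. As before, the repository names and file names are placeholders, and the example assumes the two changes do not conflict. Watch the hash ID of J: the copy has the same message and the same change, but a different hash ID and a different parent.

```shell
set -eu
sandbox=$(mktemp -d)                  # throwaway directory, hypothetical layout
cd "$sandbox"
export GIT_AUTHOR_NAME=You GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=You GIT_COMMITTER_EMAIL=you@example.com

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo H > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m H
git clone -q --bare seed origin.git
git clone -q origin.git you
git clone -q origin.git bob

echo I > bob/bobs.txt                 # Bob publishes I
git -C bob add bobs.txt
git -C bob commit -q -m I
git -C bob push -q origin master

echo J > you/yours.txt                # you commit J locally
git -C you add yours.txt
git -C you commit -q -m J
old_j=$(git -C you rev-parse master)

git -C you fetch -q origin            # step 1: get Bob's I
git -C you rebase -q origin/master    # step 2: copy J to come after I
new_j=$(git -C you rev-parse master)

tip=$(git -C you log -1 --format=%s master)
parent=$(git -C you log -1 --format=%s master~1)

git -C you push -q origin master      # accepted: J' comes after I
commits=$(git --git-dir=origin.git rev-list --count master)
git -C you log --oneline --graph master
```

The graph printed at the end is a straight line of three commits, with no merge commit.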

Rebasing is usually better

Now that everyone has J' and everyone has forgotten about the old J commit, the graph looks like this—for both you (in your repository) and origin (in its):

...--F--G--H--I--J'  <-- master

and the little tick mark can even fall off and we will never know (or remember) that the rebase step happened. Bob will run git fetch and update his own master and he will have this same graph as well. There's no merge-commit K and merge bubble in the view: it looks like you just did your work just after Bob did his.

Sometimes, though—especially with big feature commits—it is better to put in a real merge. (Of course, with big features, it's good to develop them over time on a side branch, and then keep the side branch around in case there are small glitches that are easy to see in the side branch, but hard to see in the big merge.) But you may not know, until you have run git fetch, whether someone else brought in a big feature themselves. If they did, rebasing might be substantially more difficult than merging—in which case you may want to merge.

In any case, rebasing is usually the right command. The only real drawback, most of the time, is that you need to make sure you're only rebasing unpublished commits (because when you copy your commits, those hash IDs will be different in the new copies, and Git identifies things by hash IDs). But, this is automatically true here.
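If you want to check which commits are still unpublished before rebasing, one option is the two-dot range origin/master..master, which selects commits reachable from master but not from origin/master. The sketch below assumes origin/master is current (that is, you have just fetched) and uses made-up repository and file names.

```shell
set -eu
sandbox=$(mktemp -d)                  # throwaway directory, hypothetical layout
cd "$sandbox"
export GIT_AUTHOR_NAME=You GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=You GIT_COMMITTER_EMAIL=you@example.com

git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo H > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m H
git clone -q --bare seed origin.git
git clone -q origin.git you

for name in J1 J2; do                 # two local commits, not pushed
    echo "$name" > "you/$name.txt"
    git -C you add "$name.txt"
    git -C you commit -q -m "$name"
done

git -C you fetch -q origin            # make sure origin/master is current

# Commits on master that origin/master does not have: these are unpublished.
unpublished=$(git -C you rev-list --count origin/master..master)
# Commits on origin/master that master does not have: work by others.
missing=$(git -C you rev-list --count master..origin/master)

git -C you log --oneline origin/master..master
echo "unpublished: $unpublished   missing: $missing"
```

Anything listed by the first range exists only in your repository, so copying it with rebase disturbs nobody else.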

So, you should run git fetch—this will get you "Bob's" commit(s)—and then run git rebase, to copy today's commits to come after Bob's. Then you can git push successfully.
