Martin AJ

Reputation: 6697

Why are all commits gone except the last one?

I'm pretty much a newbie to Git. Five days ago, I initialized Git in a local folder (named myproject) with this command:

$ git init

I also added all the folders and files in the myproject folder to the staging area with this command:

$ git add .

And then committed them with this command:

$ git commit -m "my first commit"

After that, I set a remote for it:

$ git remote add origin https://...

And then pushed everything to my repository on GitHub with this command:

$ git push -u origin master

Between five days ago and yesterday, I committed changes 7 more times.

Note: So I had 8 commits as of yesterday.


Today, I was trying to get some experience with Git's commands, and now, surprisingly, I see only one commit (only the last one exists; all the previous commits are gone). Honestly, I'm not sure what I did, but I suspect this command:

$ git push -f origin {token}:master

Would it remove all commits except the last one? If not, do you know what I did? (I want to know so that I can avoid doing it again.)


EDIT: I'm not sure whether this is important, but while I was playing with commands (to gain experience), I also created one branch with checkout.

Upvotes: 0

Views: 100

Answers (1)

torek

Reputation: 488183

Will git push -f remove [some] commits ...?

Potentially, yes. It depends on precisely what you push.

Today, I was trying to get some experience with Git's commands, and now, surprisingly, I see only one commit (only the last one exists; all the previous commits are gone).

First, let's note that Git tries very hard never to give up any commits, so you are probably OK (as long as you have not removed files from within the .git directory itself anyway).

Second, let's take a quick look at how commits work.

In Git, a commit is itself a rather small object. Here's an example commit from the Git repository for Git itself (but with each @ replaced with a space):

$ git rev-parse HEAD
e05806da9ec4aff8adfed142ab2a2b3b02e33c8c
$ git cat-file -p HEAD | sed 's/@/ /'
tree 14b1822ef3c838411b7b204e03b7272b8ddd63d3
parent af09003b2897db76cefdb08ab363ed68f2bb295b
author Junio C Hamano <gitster pobox.com> 1482859071 -0800
committer Junio C Hamano <gitster pobox.com> 1482859071 -0800

Fourth batch for 2.12

Signed-off-by: Junio C Hamano <gitster pobox.com>

That's an entire commit, namely the one whose ID is e05806d.... However, this commit does not quite stand alone. Note the tree and parent lines. The tree line gives the ID of another Git object, which I won't quote in its entirety, but will show the top of, to give a bit of flavor:

$ git cat-file -p HEAD: | expand | head -5
100644 blob 320e33c327c6f597bcfd255b13876f21b0b2d8aa    .gitattributes
100644 blob 6722f78f9ab7e9647a3358a52e627f1c9e83f685    .gitignore
100644 blob 9cc33e925de8adc562daf9b176135aba7b5e4d9b    .mailmap
100644 blob 3843967a692d1642e43f536d5e2652b566ca554d    .travis.yml
100644 blob 536e55524db72bd2acf175208aef4f3dfc148d42    COPYING

As you can see, this tree refers in turn to blobs, which are Git objects holding the file contents. That's how the commit stores files: it has a tree, and the tree lists blobs and maybe more trees; each tree entry gives the object ID of a file's contents and the name to use in the work-tree when checking out the commit.

The key line to concentrate on here, though, is the parent line. That line gives the ID of another commit.
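To make the parent line concrete, here is a minimal sketch that builds a two-commit throwaway repository and reads the parent ID straight out of the commit object. The temporary directory, file name, and placeholder identity are assumptions made for this illustration only.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory for this sketch
cd "$repo"
git init -q
git config user.name "Example"         # placeholder identity (assumption)
git config user.email "example@example.invalid"

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -am "second"

# The raw commit object: tree, parent, author, committer, then the message.
git cat-file -p HEAD

# The parent line names the previous commit; HEAD^ resolves to the same ID.
parent_line=$(git cat-file -p HEAD | sed -n 's/^parent //p')
parent_rev=$(git rev-parse HEAD^)
echo "parent from object: $parent_line"
echo "parent from HEAD^:  $parent_rev"
```

The two printed IDs should be identical, since HEAD^ is just shorthand for "the first parent recorded in the commit HEAD resolves to".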

Git works backwards

Git uses these parent IDs, which all point backwards (from newer commits to older ones), to build what Git calls the "commit graph", or Directed Acyclic Graph (DAG). We can draw this DAG ourselves, replacing each big ugly SHA-1 hash ID with a round o representing one commit:

o <- o <- o   <-- master

This is a very simple, totally linear graph of a repository with only three commits in it. The name master—the branch name—"points to" the latest commit. That is, it has one of those big ugly SHA-1 hash IDs in it. (The name HEAD, incidentally, simply "points to" master most of the time. If you examine .git/HEAD you will see that the file consists of a single line reading: ref: refs/heads/master.)

Git uses the contents of master (directed there by HEAD) to find the hash ID for the most recent commit. Then it uses that commit's parent line to find the previous (second) commit, and that one's parent line to find the previous (first) commit. The very first commit has no parent line at all, because there is no previous commit, so the process stops there, and Git can thus show us all three commits.

Again, it simply starts from HEAD—which says "look at master"—and goes to master, which gives the latest master-commit ID, and then each of those gives you (and Git) an earlier ID.
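The backwards walk described above can be imitated by hand. The following is a rough sketch, not how git log is actually implemented: it starts from whatever HEAD resolves to and follows parent lines until it reaches a commit that has none. The repository, file name, and identity are made up for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the name "master" regardless of local defaults
git config user.name "Example"            # placeholder identity (assumption)
git config user.email "example@example.invalid"

for n in 1 2 3; do
    echo "$n" > file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

cat .git/HEAD                             # ref: refs/heads/master

# Start at the commit HEAD resolves to, then follow parent lines by hand.
id=$(git rev-parse HEAD)
walked=0
while [ -n "$id" ]; do
    git log -1 --format='%h %s' "$id"
    walked=$((walked + 1))
    # First parent only; empty for the root commit, which ends the loop.
    id=$(git cat-file -p "$id" | sed -n 's/^parent //p' | head -n 1)
done
echo "walked $walked commits"
```

The loop visits the commits newest first and stops after the root commit, reporting three commits in total.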

Your eight commits

With a total of eight commits, we might give them all one-letter names A through H (which are much easier to work with than 40-character non-obvious hashes), and draw them like this:

A--B--C--D--E--F--G--H   <-- master

These are all in your repository, and though I did not draw the internal arrows in this time, H points back to G, which points back to F, and so on to A.

If you make more than one branch, though, you get something that requires drawing more than one line. Suppose, for instance, that you make commits A through E on master, but then run git checkout -b side to create a new branch named side:

A--B--C--D--E   <-- master, HEAD->side

Note that at this point, side points to commit E, just like master. But HEAD points to side instead of to master, and now you make the last three commits on side rather than on master; and this happens:

A--B--C--D--E   <-- master
             \
              F--G--H   <-- HEAD->side

Now side points to H, while master still points to E. If you git checkout master and run git log, Git will start by reading HEAD (which will now point back to master instead of side), and then read commit E, and then D on back to A, and you will see only five commits instead of all eight.

If you run git log side, you will see all eight commits. This is because git log will start not from HEAD, but from side, which points to H. Note that F still points back to E, so logging transitions smoothly from side into master. In fact, commits A through E are on both branches, at this point.
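The five-versus-eight difference is easy to reproduce in a scratch repository. This sketch uses empty commits named A through H, and a small helper function called commit that exists only for this example; git rev-list --count counts the same commits that git log would show.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"            # placeholder identity (assumption)
git config user.email "example@example.invalid"

# Hypothetical helper: one empty commit whose message is the given letter.
commit() { git commit -q --allow-empty -m "$1"; }

for name in A B C D E; do commit "$name"; done   # made on master
git checkout -q -b side
for name in F G H; do commit "$name"; done       # made on side

git checkout -q master
on_master=$(git rev-list --count HEAD)    # what a plain "git log" walks
on_side=$(git rev-list --count side)      # what "git log side" walks
echo "master: $on_master commits, side: $on_side commits"
```

With HEAD on master the count is 5, while starting from side gives all 8, because F links back to E.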

If you now delete the name side, this happens:

A--B--C--D--E   <-- HEAD->master
             \
              F--G--H   [abandoned]

Commits F through H will seem to disappear entirely! But here's a key thing: they're still in there. They are protected through what Git calls a reflog. There is one reflog for each branch name, and one for HEAD. The one for branch side is deleted when the branch is deleted, so it no longer protects the commits; but the one for HEAD remains, and continues to protect the commits (for 30 days by default, after which the reflog entries themselves expire).

Hence, to locate commit H, you can look in the reflogs. Run git reflog HEAD (or just git reflog) to see the reflog for HEAD. How will you know which commit is which, when all they have is big ugly hash IDs? Well, you can git show any promising one by its ID or its reflog name, to see if you can find it. It tends to be a bit difficult, especially if the reflog is full of all kinds of useless leftovers (which is often the case).

You can also use git log -g HEAD, which gives you longer output, and that may help. Or, you can even use git log -g -p HEAD, to see them all as patches.
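As a sketch of such a recovery, the script below deletes side and then finds commit H again through HEAD's reflog. It searches by commit message, which works here only because the script itself created a commit with the message H; in a real repository you would have to inspect the entries yourself. The branch name recovered and the helper function commit are assumptions for this example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"            # placeholder identity (assumption)
git config user.email "example@example.invalid"

commit() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

for name in A B C D E; do commit "$name"; done
git checkout -q -b side
for name in F G H; do commit "$name"; done
git checkout -q master
git branch -D side > /dev/null            # F, G, H no longer have a branch name

# HEAD's reflog still lists every commit HEAD has pointed to.
git reflog

# Take the first reflog entry whose commit subject is "H" ...
lost=$(git log -g --format='%H %s' HEAD | awk '$2 == "H" { print $1; exit }')

# ... and give that commit a branch name again.
git branch recovered "$lost"
recovered_count=$(git rev-list --count recovered)
echo "recovered branch has $recovered_count commits"
```

Once a branch name points at H again, all eight commits are reachable and no longer depend on the reflog entry surviving.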

What about git push?

I mentioned that git push might remove some commits. Let's go back to those graph drawings, specifically the one with all eight commits on master:

A--B--C--D--E--F--G--H   <-- master

That's in your repository. But there is another, separate Git repository elsewhere, specifically over on the machine you're referring to as origin. That Git repository has some set of commits, and some branch name(s) pointing to the tip commit of each branch. Just as your master points to commit H, their master points to some commit.

If you run git push without -f (-f is the "force" flag, and can be spelled --force as well), so that you do git push origin master, then your Git will send your commits to their Git,[1] and then ask them, politely, if they would please set their master to point to commit H, just as yours does.

If their master points somewhere to the left of H (i.e., to any of the commits in A-B-C-...-G), it's quite safe for them to make their master point to the new H commit, because H points back to G, G points to F, and so on all the way back to A. That means they won't "lose" any commits. (The motion from wherever their master points now, to H, is called a fast-forward operation.) But if they have some other commits that you lack, so that their master points to something else (even something only slightly different), they will refuse your polite request, because accepting it would lose work. For instance, suppose that instead of A-through-H, they have:

A--B--C--D--E--I

You hand them the F-G-H chain, which they add to their repository:

A--B--C--D--E--I
             \
              F--G--H

and then ask them to set their master to H; and they will say: "no, that is not a fast-forward". They will leave their master pointing to their commit I.[2]

If, on the other hand, you use a force-push (with -f or --force), your Git sends them, instead of a polite request, a demand: "set your master to what I say!" They do not have to obey, but they generally will. If that means losing commits, well, that loses commits.
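The refusal and the forced overwrite can both be seen with a local bare repository standing in for origin. This is only a sketch: the directory names mine and other, the second identity, and the helper function commit are invented for the example, and it assumes the receiving repository has not been configured to deny non-fast-forward pushes.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

git init -q mine
cd mine
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"            # placeholder identity (assumption)
git config user.email "example@example.invalid"
git remote add origin "$work/origin.git"

commit() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

for name in A B C D E; do commit "$name"; done
git push -q origin master                 # origin now has A..E

# Someone else adds commit I on top of E and pushes it.
git clone -q "$work/origin.git" "$work/other"
git -C "$work/other" -c user.name=Other -c user.email=other@example.invalid \
    commit -q --allow-empty -m I
git -C "$work/other" push -q origin master

# Meanwhile F, G and H are committed here without fetching I.
for name in F G H; do commit "$name"; done

# Polite request: refused, because moving master to H would drop I.
if git push -q origin master 2> /dev/null; then plain=accepted; else plain=rejected; fi

# Forced push: origin's master is set to H, and I is dropped there.
git push -q -f origin master
remote_tip=$(git --git-dir="$work/origin.git" log -1 --format=%s master)
echo "plain push: $plain; origin master now at: $remote_tip"
```

The plain push is rejected as a non-fast-forward, while the forced one leaves origin's master at H, with commit I no longer reachable from any name in that repository.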

You showed:

$ git push -f origin {token}:master

without explaining what you meant by token. The token:master string in git push is called a refspec, and the thing on the left of the colon is the source part of the refspec, with the thing on the right—master, in this case—the destination. If you put a hash ID in here, your Git looks up the hash ID, makes sure it refers to some commit in your repository, and then sends their Git a request-or-command (depending on the force flag as usual) to have them set their destination name to the ID you specified as the source.

If that's a command (that is, you used the force flag), and the source ID causes them to lose commits, they will probably lose those commits.

Note, however, that this does not make you lose commits. Your master should still point to your commit H. The commands that shift your master name around are git branch -f and git reset (when HEAD points to master, that is: git reset adjusts the current branch, which is the one HEAD points to).
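Here is a sketch of a forced push whose refspec has a hash ID on the left. Since the question does not say what {token} was, the example arbitrarily uses the ID of commit E (master~3) as the source; the bare repository standing in for origin and the helper function commit are likewise assumptions.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

git init -q mine
cd mine
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"            # placeholder identity (assumption)
git config user.email "example@example.invalid"
git remote add origin "$work/origin.git"

commit() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

for name in A B C D E F G H; do commit "$name"; done
git push -q origin master                 # origin has all eight commits

# Arbitrary stand-in for {token}: the hash ID of commit E.
source_id=$(git rev-parse master~3)
git push -q -f origin "$source_id:master"

remote_count=$(git --git-dir="$work/origin.git" rev-list --count master)
local_count=$(git rev-list --count master)
echo "origin master: $remote_count commits; local master: $local_count commits"
```

After the forced push, origin's master reaches only A through E, while the local master still reaches all eight commits, so a later git push origin master would be a fast-forward that puts F, G and H back.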


[1] There's a fairly fancy protocol exchange by which your Git and their Git figure out which commits you have that they don't, and which other objects, if any, they need to get in order to receive everything from you. The details are beyond the scope of this posting though.

[2] You might wonder what happens to these commits, if the branch is not changed to point to them. Repositories that receive pushes are generally "bare" repositories, which have no reflogs by default, and these unused objects are supposed to be thrown away immediately (and usually are). With no reflogs, if you force-push and "lose" a commit, that commit is also usually thrown away immediately, so beware! The lost commit(s) normally came in from some Git repository somewhere, and if that repository is still intact, those commits are still in it, but you may have to search quite a bit to find that repository.

Upvotes: 1
