Fabic

Reputation: 508

Does every git commit have its own history?

While searching for a bad commit by hand, I noticed that the history changes if I check out a specific commit. I have two branches, develop and payoff. I merged payoff into develop, so the history looks like this:

A - B - - - E - F - G <- develop
     \     /
      C - D <- payoff

G is my current HEAD of develop. C and D are commits in payoff. E is the merge commit. If I run git log I see the following commits: G - F - E - D - C - B - A.

Then I check out F and run git log again. Now I see this: F - B - A. The commits C and D are not shown.

Why is that? (btw: payoff is still available and not deleted if that matters)

EDIT

Here is my git log and my git log --graph --oneline on HEAD.

enter image description here

And here it is after checking out Commit B.

enter image description here

Upvotes: 0

Views: 436

Answers (2)

torek

Reputation: 488103

Edit: now that we have a snapshot of part of the graph, we can tell a lot better what is going on. I think you are getting tripped up (at least in part) by the fact that git log sorts the commits it displays, as well as only displaying commits reachable from HEAD (or from whatever starting points you list, if any; see earlier answer below this edit).

The graph is fairly complex and I am too lazy to retype most of it so here's some of it converted to plain text (which loses the color but makes it easy to quote). With any luck I transcribed the abbreviated SHA-1s correctly (they are a pain to retype).

* 4d0be7a Commit A
* 2e2be6d Merge remote-tracking branch 'origin/develop' into develop
|\
| *   97504ea Merge branch 'payoff' into develop
| |\
| | * ede23f0 Commit C
| | * 0e1df38 Commit D
[snip]
| * | 9a20c3c Commit x1

[snip - there is another merge in here]

| | * | 0fdd1ff Commit y3
| * | | eed1783 Commit z1
* | | | 43dcf79 Commit B
* | | | bb2bd73 Commit K

Note the lines, vertical in this case (instead of horizontal in my original text below), and in color in the screenshot, made of vertical bars | connecting commit nodes.

Commit sorting

git log normally sorts commits in time order (according to the "committer" time stamp), with the most recent first, so that time stamps get further into the past as you go down the listing (instead of moving further left in a horizontal graph). When you add --graph, git log is forced to use a topological sort instead. This sort may produce different results, especially at merge commits, where there are two¹ parents.

In a git log topological sort, every child commit must be shown before any of its parents are shown. The topmost merge, 2e2be6d, is the child of two commits, namely 97504ea and 43dcf79.

We can see that commit 4d0be7a is HEAD at the time of the git log --graph and it is not itself a merge. Right behind it is 2e2be6d which is a merge, and therefore has two parents. One parent is 97504ea (another merge) and the other is 43dcf79 (your "commit B"). Let's look a bit further at those commits' parents too: the two parents of 97504ea are ede23f0 (Commit C) and 9a20c3c (Commit x1), and just keep those in the back of our minds.

The time stamp on a merge is (normally²) newer than the time stamp on either of its parents. This means that whether or not you use --topo-order to force a topo-sort, or --graph which forces it for you, you will see merge 2e2be6d before both merge 97504ea and 43dcf79 (Commit B), and you will see merge 97504ea before ede23f0 (Commit C) and 9a20c3c (Commit x1). The tricky bit comes about immediately after that.

Without topo order, in what order will you see commits B, C, and x1?

We cannot tell just from this graph (which is sorted in topo order but does not show the time stamps). The only way to tell is to look at a different git log output, or examine the time stamps on the three commits. Fortunately we have some of that information in one of your other screenshots, specifically the first one with the full SHA-1 IDs and date fields. Unfortunately, the date fields shown are the author date fields, not the committer date fields that git uses for sorting. Fortunately, these two are probably the same. Unfortunately we do not see commit x1, but let's just guess that its date is "earlier".

So here we have 4d0be7a (Commit A) at the top, with the newest date, the afternoon of 15 April. Below that we have 2e2be6d (a merge), with a date in the morning of 15 April, and below that we have 43dcf79 (Commit B) with a date stamp after 5 PM of 14 April. (And all the time zones are +0200, probably Europe somewhere.) We cannot see merge 97504ea: it must have a time stamp earlier than that of Commit D.

Hence, when HEAD points to Commit A, git log will sort these commits so that it shows A, then one of the merges (specifically 2e2be6d), then B, then C, and so on. When you add --topo-order, or imply it via --graph, git log changes its sort so that the parents of 2e2be6d are shown earlier, including the second merge.
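To make the difference between the two orderings concrete, here is a minimal Python sketch. It is not git's implementation. The commit names are borrowed loosely from the graph above (M1 and M2 stand in for the two merges), the parent order within each merge is assumed, and the integer time stamps are invented purely for illustration.

```python
import heapq

# Hypothetical graph: commit -> (committer time stamp, list of parents).
# Time stamps and parent order are made up for this sketch.
COMMITS = {
    "A":  (50, ["M1"]),
    "M1": (40, ["B", "M2"]),
    "B":  (30, ["K"]),
    "M2": (20, ["x1", "C"]),
    "C":  (15, []),
    "x1": (10, []),
    "K":  (5,  []),
}

def date_order(start):
    """Walk from start, always emitting the newest pending commit next."""
    heap = [(-COMMITS[start][0], start)]  # negate so the newest pops first
    seen = {start}
    out = []
    while heap:
        _, commit = heapq.heappop(heap)
        out.append(commit)
        for parent in COMMITS[commit][1]:
            if parent not in seen:
                seen.add(parent)
                heapq.heappush(heap, (-COMMITS[parent][0], parent))
    return out

def topo_order(start):
    """Emit a commit only after all its children, one line of history at a time."""
    # Step 1: find every commit reachable from start.
    reachable, stack = set(), [start]
    while stack:
        commit = stack.pop()
        if commit not in reachable:
            reachable.add(commit)
            stack.extend(COMMITS[commit][1])
    # Step 2: count, for each commit, how many children are still unshown.
    pending = {commit: 0 for commit in reachable}
    for commit in reachable:
        for parent in COMMITS[commit][1]:
            pending[parent] += 1
    # Step 3: a stack (last in, first out) keeps us on one line until it ends.
    out, stack = [], [start]
    while stack:
        commit = stack.pop()
        out.append(commit)
        for parent in COMMITS[commit][1]:
            pending[parent] -= 1
            if pending[parent] == 0:
                stack.append(parent)
    return out

print(date_order("A"))  # ['A', 'M1', 'B', 'M2', 'C', 'x1', 'K']
print(topo_order("A"))  # ['A', 'M1', 'M2', 'C', 'x1', 'B', 'K']
```

With these made-up time stamps, the date-ordered walk shows B right after the first merge, while the topological walk finishes the second merge's side before it gets to B.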

If, by doing git checkout of some other commit or branch name, we descend into that other part of the graph and run git log, we can no longer move back up through the one-way links to reach merge 2e2be6d, which means we cannot move back down to commit B. This is why B no longer appears in the output.


¹ Technically a merge is any commit with two or more parents, but you only get more if you make an "octopus merge", which we need not worry about here.

² I say "normally" because computer clocks can be wrong, and because you can tell git to put a different time stamp (past or future) on any commit. For instance if you make a commit with a time stamp in year 2038, it will show up at the top of the listing whenever it shows up at all, unless you choose some other sort order.


I believe your merge commit is commit F, i.e., you could draw this as:

A - B -- E --- F   <-- develop
     \       /
       C - D       <-- payoff

The answer to the subject-line question:

Does every git commit have its own history?

is: "yes, sort of, but this may be the wrong way to ask." That is, don't think of each commit as storing history per se, but rather as keeping track of several metadata items:

  • author (name, email, and date);
  • committer (name, email, and date; often the same as author);
  • a list of parent IDs (this is the key to this particular question); and
  • your commit log message.
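As a rough mental model only (this is not git's on-disk format), the items in the list above could be sketched as a small Python record. The class and field names, and the example identity, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Identity:
    name: str
    email: str
    date: str  # kept as a plain string to keep the sketch short

@dataclass(frozen=True)
class Commit:
    author: Identity
    committer: Identity
    parents: tuple  # IDs of the parent commits; empty for a root commit
    message: str

def is_merge(commit):
    """A merge is any commit that lists two or more parents."""
    return len(commit.parents) >= 2

# Invented identity and commits, using single letters instead of SHA-1 IDs.
me = Identity("Example Dev", "dev@example.com", "2015-04-15 09:00 +0200")
root = Commit(me, me, (), "Commit A")
merge = Commit(me, me, ("E", "D"), "Merge branch 'payoff' into develop")

print(is_merge(root), is_merge(merge))  # False True
```

Note that nothing in the record points forward to children. The only links are the parent IDs.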

A merge commit is any commit with at least two parents. Assuming that F—the newest commit on develop—is the merge commit, its two parents are E, which used to be the tip of develop before doing the merge, and D, which was and still is the tip of payoff.

Commit E records its (single) parent, as do D, C, B, and A. Note that while B has two children (C and E), commits record only their parents. In order to find siblings or children, git must do some graph reconstruction work; in fact, it is the same work we did to draw this in the first place.
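Here is a small Python sketch of that reconstruction, assuming the six-commit graph drawn above, with single letters standing in for the real SHA-1 IDs. The only stored data is the parent map. The child map has to be computed by inverting it.

```python
from collections import defaultdict

# What each commit records: its parents only (graph from the drawing above).
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["C"],
    "E": ["B"],
    "F": ["E", "D"],  # the merge commit
}

def children_of(parents):
    """Invert the parent links to find each commit's children."""
    children = defaultdict(list)
    for commit, commit_parents in parents.items():
        for parent in commit_parents:
            children[parent].append(commit)
    return {commit: sorted(kids) for commit, kids in children.items()}

CHILDREN = children_of(PARENTS)
print(CHILDREN["B"])    # ['C', 'E']
print("F" in CHILDREN)  # False: nothing lists F as a parent
```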

When you run git log (without extra arguments), git starts with your current commit. It keeps track of which commit is current using the special name HEAD. Typically HEAD actually contains the name of a branch—that is, HEAD refers to develop or payoff, for instance, and the branch-name in turn refers to the specific commit. But when you use git checkout with a specific commit, like E for instance, you get into "detached HEAD" mode, and now HEAD contains the raw SHA-1 ID of the specific commit.
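The two HEAD states can be sketched in Python as follows. The "ref: refs/heads/..." text is what git writes in the HEAD file when a branch is checked out. The function name, the BRANCHES table, and the single-letter commit IDs are stand-ins invented for this sketch.

```python
# Hypothetical branch table: branch name -> commit it points to.
BRANCHES = {"develop": "F", "payoff": "D"}

def resolve_head(head, branches):
    """Return (commit_id, detached) for the contents of HEAD."""
    prefix = "ref: refs/heads/"
    if head.startswith(prefix):
        # HEAD names a branch; the branch in turn names the commit.
        return branches[head[len(prefix):]], False
    # Otherwise HEAD holds a raw commit ID: "detached HEAD" mode.
    return head, True

print(resolve_head("ref: refs/heads/develop", BRANCHES))  # ('F', False)
print(resolve_head("E", BRANCHES))                        # ('E', True)
```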

Since git log starts with the current commit, and commits only contain their parents' IDs, if we start at E and work backwards in history, we will only find E, then B, then A, which is what you observed.

You can tell git log to start elsewhere: for instance, git log payoff starts with whichever commit payoff points to (in this case D), or git log develop starts with whichever commit develop points to (in this case, F). Using --all tells git log to start with every commit found from every reference. We haven't defined "reference" yet here, but the short version is that this means all branches, and all tags, and even a few special cases that are not branches or tags, such as the special reference that git stash uses.
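The effect of the starting point can be sketched in a few lines of Python. This assumes the same six-commit graph as above, with letters in place of SHA-1 IDs and a hypothetical REFS table in place of real references. It models reachability only, not the order in which git log prints commits.

```python
# Parent links for the graph drawn above.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["C"],
    "E": ["B"],
    "F": ["E", "D"],
}
# Hypothetical references: name -> commit.
REFS = {"develop": "F", "payoff": "D"}

def reachable(starts):
    """Collect every commit found by following parent links from the starts."""
    seen = set()
    stack = list(starts)
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen

print(sorted(reachable(["E"])))             # ['A', 'B', 'E']
print(sorted(reachable([REFS["payoff"]])))  # ['A', 'B', 'C', 'D']
print(sorted(reachable(REFS.values())))     # ['A', 'B', 'C', 'D', 'E', 'F']
```

Starting at E never reaches C or D, because no parent link leads from E (or B, or A) to them. Starting from every reference, as --all does, finds all six.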

Upvotes: 2

CodeWizard

Reputation: 142064

While searching for a bad commit by hand I ...

Git has a built-in mechanism for that:

bisect: https://www.kernel.org/pub/software/scm/git/docs/git-bisect.html

Here is a demo of how to use it to find the desired commit:
https://github.com/nirgeier/Tutorials-bisect
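The idea behind bisecting is a binary search between a known-good and a known-bad commit. Here is a minimal Python sketch of that idea. It assumes a strictly linear history and a hypothetical is_bad test supplied by the caller, whereas the real git bisect works on the full commit graph.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit.

    commits is ordered oldest to newest; commits[0] is assumed good and
    commits[-1] is assumed bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]

# Invented linear history in which everything from D onward is broken.
history = ["A", "B", "C", "D", "E", "F", "G"]
broken_since = history.index("D")

def is_bad(commit):
    return history.index(commit) >= broken_since

print(first_bad(history, is_bad))  # D
```

Each step halves the remaining range, so seven commits need only a couple of tests instead of one per commit.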


To answer your question.

Each commit has its own reference to the prior commit (its parent) in the commit chain. A commit is also known as a snapshot.

enter image description here


Does every git commit have its own history?

You can see in the image below how git stores the information about all the commits.

Each commit (snapshot) contains all the content that was committed.

enter image description here


In your case you have branches, which result in several commits. Each branch points to its own commit (snapshot), which can look like this:

enter image description here

Upvotes: 2
