Zebra Propulsion Lab

Reputation: 1746

How do I check out a snapshot of the master branch in git as it was at a given date/time?

If a commit was made on another branch and merged into master after the given date/time, the changes in that commit should not appear in the snapshot, even though the commit itself was made before the given date/time.

Option 2 in this answer doesn't do the job I need, as it simply sorts the commits chronologically and returns the latest revision before the given date/time.

Upvotes: 2

Views: 1905

Answers (1)

torek

Reputation: 488619

You can still start with that answer (which I'll repeat here, slightly modified, since I'm about to pick it apart):

git checkout `git rev-list -n 1 --before="yesterday" master`

The problem you want to avoid is that git rev-list finds all commits that are reachable from the tip of branch master, including commits made on other branches that were later merged into master, e.g.:

...-A--C---E   <-- master
   /      /
...--B---D     <-- develop

Let's say that the commit you would like to find is the one labeled C, but D is later than C (while still being old enough to be picked up by the --before option). Here, git rev-list finds all commits reachable from E (the tip of branch master) whose date-stamp [1] is early enough. The -n 1 causes it to stop as soon as it finds one: without this, it would list commit D first, then commit C, then even more commits (B, then A, etc.).
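To see the problem concretely, here is a minimal sketch that builds a throwaway repository shaped like the graph above and runs the command against it. The dates, the cutoff, the extra root commit, and the commit helper function are all made up for illustration, and empty commits stand in for real work.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the name, whatever init.defaultBranch says
git config user.name "Example User"
git config user.email "user@example.com"

# commit <subject> <yyyy-mm-dd>: empty commit with fixed author and committer dates
commit() {
  GIT_AUTHOR_DATE="$2 12:00:00 +0000" GIT_COMMITTER_DATE="$2 12:00:00 +0000" \
    git commit -q --allow-empty -m "$1"
}

commit root 2020-01-01
git branch develop
commit A 2020-01-02
commit C 2020-01-04
git checkout -q develop
commit B 2020-01-03
commit D 2020-01-05
git checkout -q master
# E is the merge of develop into master, made after the cutoff used below
GIT_AUTHOR_DATE="2020-01-07 12:00:00 +0000" GIT_COMMITTER_DATE="2020-01-07 12:00:00 +0000" \
  git merge -q --no-ff -m E develop

picked=$(git rev-list -n 1 --before="2020-01-06 00:00:00 +0000" master)
subject=$(git log -1 --format=%s "$picked")
echo "plain rev-list picks: $subject"
```

With these dates the command should report D, the commit from develop, rather than C.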

However, there are more options you can pass to git rev-list. For instance, --topo-order causes the commits to come out in a "topologically sorted" order: you'd get D, then B, then C, then A. Not quite what we wanted either, but it shows that git rev-list is a complicated little beastie. :-)

What we really need is a way to tell git rev-list that it should not go down the path of commits that were merged into master from other branches. It turns out there are two ways to do this, with two different flaws.

using --first-parent

The first and simplest way is to use --first-parent (simply add this to the arguments to git rev-list). In this case, whenever git rev-list hits a fork while working backwards—e.g., from E to both C and D—this instructs it to follow only the first parent. When this works, it works because every git merge commit specifically records the current branch as the first parent, and any merged-in commits/branches as second parents (or even third, fourth, etc., parents, in the case of an "octopus merge").
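As a rough check of the --first-parent behavior, the sketch below builds a throwaway repository with the same shape as the master/develop graph and adds --first-parent to the rev-list call. The dates, the cutoff, the extra root commit, and the commit helper are assumptions chosen for illustration only.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the name, whatever init.defaultBranch says
git config user.name "Example User"
git config user.email "user@example.com"

# commit <subject> <yyyy-mm-dd>: empty commit with fixed author and committer dates
commit() {
  GIT_AUTHOR_DATE="$2 12:00:00 +0000" GIT_COMMITTER_DATE="$2 12:00:00 +0000" \
    git commit -q --allow-empty -m "$1"
}

commit root 2020-01-01
git branch develop
commit A 2020-01-02
commit C 2020-01-04
git checkout -q develop
commit B 2020-01-03
commit D 2020-01-05
git checkout -q master
# master is the current branch here, so C becomes the first parent of merge E
GIT_AUTHOR_DATE="2020-01-07 12:00:00 +0000" GIT_COMMITTER_DATE="2020-01-07 12:00:00 +0000" \
  git merge -q --no-ff -m E develop

picked=$(git rev-list -n 1 --first-parent --before="2020-01-06 00:00:00 +0000" master)
subject=$(git log -1 --format=%s "$picked")
echo "--first-parent picks: $subject"
```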

The --first-parent approach fails when there's a merge commit on your branch caused by someone using, e.g., git pull to merge their work into a branch line. Remember that git pull is just git fetch followed by git merge. Suppose someone working on a branch named branch makes their private commit K, and someone else also working on branch makes their private commit L. The person who made L pushes first. The person who made K then tries to push, gets "not a fast forward", and so pulls and pushes again. Their pull merges their K with commit L:

...-o--K---M   <-- branch
     \    /
      --L-     <-- origin/branch

and then they push successfully so that origin/branch now points to merge commit M:

...-o--K---M   <-- branch, origin/branch
     \    /
      --L-

The first parent of merge M is K; L, which was made on branch by someone else, is the second parent. The --first-parent option to git rev-list will skip over commit L because it was merged in (by whoever made and pushed K).
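The following sketch simulates that situation in a single throwaway repository. It is a simplification: a temporary local branch named other (my invention) stands in for the second person's clone and for origin/branch, and the dates, cutoff, and commit helper are likewise assumptions for illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch   # the shared branch is literally named "branch"
git config user.name "Example User"
git config user.email "user@example.com"

# commit <subject> <yyyy-mm-dd>: empty commit with fixed author and committer dates
commit() {
  GIT_AUTHOR_DATE="$2 12:00:00 +0000" GIT_COMMITTER_DATE="$2 12:00:00 +0000" \
    git commit -q --allow-empty -m "$1"
}

commit o 2020-01-01
git branch other              # stand-in for the second person's work
commit K 2020-01-02           # first person's private commit
git checkout -q other
commit L 2020-01-03           # second person's commit, pushed first
git checkout -q branch
# what the first person's "git pull" amounts to: merge L into their K
GIT_AUTHOR_DATE="2020-01-05 12:00:00 +0000" GIT_COMMITTER_DATE="2020-01-05 12:00:00 +0000" \
  git merge -q -m M other
git branch -q -D other        # afterwards no separate label points at L

cutoff="2020-01-04 00:00:00 +0000"
plain=$(git log -1 --format=%s "$(git rev-list -n 1 --before="$cutoff" branch)")
first=$(git log -1 --format=%s "$(git rev-list -n 1 --first-parent --before="$cutoff" branch)")
echo "without --first-parent: $plain"
echo "with --first-parent:    $first"
```

Here L is the newest commit on branch before the cutoff, yet --first-parent reports K because L is only reachable through the second parent of M.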

If you have a strict policy of rebasing instead of merging, this situation does not arise in the first place, so you can ignore it. If not, you can either live with the flaw (of skipping these commits) or try the second option.

using --not ...

Another way to exclude branches like develop is, well, to exclude branches like develop. You want all commits that are reachable from master, but are not reachable by starting from the tip of any other branch.

First, you need a list of all branches, or at least all branches to discard. Let's get a complete list of all branches (and only branches), except for master itself:

git for-each-ref --format='%(refname:short)' refs/heads | grep -v '^master$'

Next, you simply ask git for all commits reachable from master, excluding all commits reachable from the above branch list:

git rev-list --before=... -n 1 master --not $(git for-each-ref \
  --format='%(refname:short)' refs/heads | grep -v '^master$')

This time, when git hits commit D, it's excluded, because it's on branch develop. (It's also on branch master: a commit can be on many branches simultaneously.) Commit C, however, is only on branch master, so when it qualifies for the --before restriction, it is listed, and then the -n 1 makes git rev-list stop (so that you don't also get commit A).

The flaw with this method occurs if someone deletes branch develop. This leaves the actual revision graph unchanged, but the develop label no longer exists to point to commit D. Thus, there is no reason that git rev-list can see to exclude commit D, so that's the revision you will see.
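Here is a small sketch that exercises both the --not method and its flaw on a throwaway repository shaped like the master/develop graph. The dates, cutoff, extra root commit, and the commit and pick helpers are assumptions for illustration. The "|| true" is only there because grep exits non-zero when nothing is left after filtering, which would otherwise abort the script under set -e.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the name, whatever init.defaultBranch says
git config user.name "Example User"
git config user.email "user@example.com"

# commit <subject> <yyyy-mm-dd>: empty commit with fixed author and committer dates
commit() {
  GIT_AUTHOR_DATE="$2 12:00:00 +0000" GIT_COMMITTER_DATE="$2 12:00:00 +0000" \
    git commit -q --allow-empty -m "$1"
}

commit root 2020-01-01
git branch develop
commit A 2020-01-02
commit C 2020-01-04
git checkout -q develop
commit B 2020-01-03
commit D 2020-01-05
git checkout -q master
GIT_AUTHOR_DATE="2020-01-07 12:00:00 +0000" GIT_COMMITTER_DATE="2020-01-07 12:00:00 +0000" \
  git merge -q --no-ff -m E develop

cutoff="2020-01-06 00:00:00 +0000"
# pick: print the subject of the commit the --not method selects
pick() {
  # branch names are assumed to contain no whitespace, so $others is left unquoted
  others=$(git for-each-ref --format='%(refname:short)' refs/heads | grep -v '^master$' || true)
  git log -1 --format=%s "$(git rev-list --before="$cutoff" -n 1 master --not $others)"
}

with_develop=$(pick)
git branch -q -d develop      # delete the label; the commit graph itself is unchanged
without_develop=$(pick)
echo "develop present: $with_develop"
echo "develop deleted: $without_develop"
```

While develop exists the method reports C; once the label is gone, nothing excludes D any more and it is reported instead.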


Here's an exercise you should try. Suppose I give you only this graph of commits (having peeled off some or most of the labels):

...--o--o--o---o--o   <-- branch
      \       /
       o--o--o

How did I make this graph? Did I check out a side branch, feature, and make some commits there, then check out branch and (perhaps under another user name) make some commits there too, and then eventually git merge feature into branch but then delete feature? Or, were there two different people working on branch, and one made some commits and pushed first, while the other made some commits but pushed second and got an error from push? If the second person then did a pull and thus merged and pushed, would that produce the same graph?


[1] Based on the source, it appears to look only at the committer date, not the author date. I could not find this documented anywhere (I've been wondering about this, off and on, for quite a while).
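One way to check this empirically, rather than by reading the source, is a throwaway repository in which a commit's author date falls before the cutoff but its committer date falls after it. The commit subjects, dates, and cutoff below are made up for the experiment.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the name, whatever init.defaultBranch says
git config user.name "Example User"
git config user.email "user@example.com"

# "old": authored and committed well before the cutoff
GIT_AUTHOR_DATE="2020-01-01 12:00:00 +0000" GIT_COMMITTER_DATE="2020-01-01 12:00:00 +0000" \
  git commit -q --allow-empty -m old
# "late": authored before the cutoff, but committed after it
GIT_AUTHOR_DATE="2020-01-02 12:00:00 +0000" GIT_COMMITTER_DATE="2020-01-10 12:00:00 +0000" \
  git commit -q --allow-empty -m late

picked=$(git rev-list -n 1 --before="2020-01-05 00:00:00 +0000" master)
subject=$(git log -1 --format=%s "$picked")
echo "--before selects: $subject"
```

If --before compared author dates, this would select "late"; selecting "old" is consistent with it using the committer date.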

Upvotes: 2
