Reputation: 131636
Suppose I've git rebase'd a branch b2 of my repository, amending an older commit (c1). This commit exists, unamended, on another branch b1 (and there are some more common commits on both branches after c1, then b2 diverges from b1).
Now, I want to essentially undo my amendment to c1 on b2. How should I do this, so that the two branches' history becomes maximally identical again?
Upvotes: 1
Views: 30
Reputation: 488253
Use git rebase --onto. Specify the target with the --onto argument and specify the commits not to copy with the usual upstream argument. It's hard to say from here exactly what those arguments should be; see the long discussion below.
This requirement:
... that the two branches' history becomes maximally identical again
means that it is important to know that git rebase works by copying commits. As a quick recap, note that:
The metadata in the commit include the author and committer and log message, but also—key to this whole process—the hash ID of the parent of the commit.
To turn a commit into a change-set, i.e., to find out what someone changed in any given commit, we have Git compare the commit to its parent. The commit stores its parent's hash ID so git show or git log -p can find this on its own.
Meanwhile, a branch name like b2 just holds the hash ID of the last commit in the branch. So we can draw them (the commits and the branch names) like this:
... <-c1 <-c2 <-c3 <--b2
where each ci is an actual commit represented by its hash, and an arrow coming out of something means "points to": the branch name b2 points to commit c3, c3 points to c2, c2 points to c1, and so on.
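The backward-pointing chain can be sketched as a toy model in Python. This is only an illustration of the idea, not how Git stores anything: the names parents, branches, and history are hypothetical, and c1 is treated as the first commit only because the drawing does not show what comes before it.

```python
# Toy model: each commit records only its parent's ID, and a branch
# name records only the ID of the last (tip) commit.
parents = {
    "c3": "c2",
    "c2": "c1",
    "c1": None,  # assumption: stop here; the real history continues further back
}
branches = {"b2": "c3"}


def history(branch):
    """Walk backwards from the branch tip, following parent links."""
    commit = branches[branch]
    seen = []
    while commit is not None:
        seen.append(commit)
        commit = parents[commit]
    return seen


print(history("b2"))  # ['c3', 'c2', 'c1']
```

Note that nothing points forward: the only way to find c2 is to start at the name b2 and walk back.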
Nothing about any commit can ever change, so we can draw the internal arrows, from commit to commit, as connecting lines instead of arrows, as long as we remember that they're part of the child and point backwards to the parent. This lets us draw, in crude text graphics, more than one branch. I'll use X1 and up as this is just an example, not exactly related to your starting point:
...--X1   <-- branch1
       \
        X2--X3   <-- branch2
If branch1 acquires more commits, the newest eventually leads back to X1:
...--X1--X4--X5   <-- branch1
       \
        X2--X3   <-- branch2
Now back to your original setup:
Suppose I've git rebase'd a branch b2 of my repository, amending an older commit (c1). This commit exists, unamended, on another branch b1 (and there are some more common commits on both branches after c1, then b2 diverges from b1).
I'll draw as accurate a picture as I can of the original setup:
                    c4--c5   <-- b1
                   /
...--c0--c1--c2--c3
                   \
                    c6--c7   <-- b2
(I'm filling in missing details with guesses, though in the end the guesses should not matter much.)
When you ran git rebase -i <start-point> while on b2 and amended/edited commit c1, Git had to copy c1 to some new and different commit. The new commit has an author and log message as usual, which were initially set up to be copied from c1, but it has a new and different hash ID and maybe a different snapshot and/or different log message (or even different author), depending on exactly what you changed. Let's call the new copy c1' to tell them apart:
                    c4--c5   <-- b1
                   /
...--c0--c1--c2--c3
       \
        c1'   [copying in progress]
Because c1 got copied to c1', Git was now forced to copy c2 to a new c2':
                    c4--c5   <-- b1
                   /
...--c0--c1--c2--c3
       \
        c1'-c2'   [copying in progress]
The difference from c1' to c2' is the same as the difference from c1 to c2. That is, if we compare c2 vs its parent c1, we'll get some set of changes. If we compare c2' to c1', we'll get the same changes, even if c1 has different contents from c1'.
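To see why a copy with the same changes still has to be a new commit, here is a small sketch. It assumes a simplified stand-in for Git's hashing in which the ID covers the parent ID, the snapshot, and the message. The function names commit_id, apply_change, and diff, and the file names, are hypothetical and chosen only for this illustration.

```python
import hashlib


def commit_id(parent_id, snapshot, message):
    """Hypothetical stand-in for Git's hashing: the ID covers the parent ID too."""
    data = f"{parent_id}\n{sorted(snapshot.items())}\n{message}".encode()
    return hashlib.sha1(data).hexdigest()


def apply_change(snapshot, change):
    """Return a new snapshot with the changed files added or overwritten."""
    new = dict(snapshot)
    new.update(change)
    return new


def diff(old, new):
    """Files whose content differs in `new` (deletions ignored in this toy)."""
    return {k: v for k, v in new.items() if old.get(k) != v}


base = {"README": "hello"}
c0 = commit_id(None, base, "c0")

# Original chain: c1, then c2 on top of it.
change_c2 = {"b.txt": "v1"}
snap_c1 = apply_change(base, {"a.txt": "v1"})
snap_c2 = apply_change(snap_c1, change_c2)
c1 = commit_id(c0, snap_c1, "c1")
c2 = commit_id(c1, snap_c2, "c2")

# Amended chain: c1' has different content, c2' re-applies the same change.
snap_c1p = apply_change(base, {"a.txt": "v1 amended"})
snap_c2p = apply_change(snap_c1p, change_c2)
c1p = commit_id(c0, snap_c1p, "c1")
c2p = commit_id(c1p, snap_c2p, "c2")

print(c1 != c1p, c2 != c2p)                                # True True
print(diff(snap_c1, snap_c2) == diff(snap_c1p, snap_c2p))  # True
```

In this model c2' carries the same change as c2, yet its ID differs because both its parent and its snapshot differ.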
Now that c2 is replaced with c2', this forces Git to copy c3 to c3' as well. That forces Git to copy c6 and c7 too. The final copied commit is c7' and git rebase finishes by yanking the name b2 over, so that the final result is:
                    c4--c5   <-- b1
                   /
...--c0--c1--c2--c3--c6--c7   [old b2, now abandoned]
       \
        c1'-c2'-c3'-c6'-c7'   <-- b2 (HEAD)
Now, I want to essentially undo my amendment to c1 on b2. How should I do this, so that the two branches' history becomes maximally identical again?
You may also have added even more commits since then:
                    c4--c5   <-- b1
                   /
...--c0--c1--c2--c3--c6--c7   [old b2, now abandoned]
       \
        c1'-c2'-c3'-c6'-c7'-c8--c9   <-- b2 (HEAD)
What you might like to end up with is:
                    c4--c5   <-- b1
                   /
...--c0--c1--c2--c3--c6--c7--c8'-c9'   <-- b2 (HEAD)
       \
        c1'-c2'-c3'-c6'-c7'-c8--c9   [abandoned]
It's probably OK (and usually a lot easier to do) if you end up with:
                    c4--c5   <-- b1
                   /
...--c0--c1--c2--c3--c6"-c7"-c8'-c9'   <-- b2 (HEAD)
with none of the abandoned commits drawn in. [1]
To get either of these, you want to copy c8 and c9 (assuming they exist), and to avoid copying c1' through c3'. This means using git rebase --onto, so that you can separate two instructions that git rebase usually combines.
That is, what git rebase does is:
Enumerate some list of commits to copy. In the example above, these were c1 through c7. If you use git rebase -i, the list goes into a modifiable instruction sheet using the word pick along with each commit's hash (shortened) and the subject line from the commit log message.
Normally, merge commits—those with two or more parents—are ejected from the list right away. (There are some modes that don't reject them; these are more complicated and we'll ignore them here.)
Which commits get listed? That's from your upstream argument: the commits to be copied are those that are reachable from HEAD by walking backwards, commit to commit: c7 leads back to c6, which jumps back to c3, which goes back to c2, and so on. But, from this list (which could go back a really long way) we remove any commits reachable from the upstream argument. So if upstream is the hash ID of c0, we'll take c0 away from the list, and also any commit before c0. That means the list starts with c1 and ends with c7, skipping the unreachable-this-way c4 and c5.
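The selection rule just described (reachable from HEAD, minus anything reachable from upstream) can be sketched on the example graph. This is a toy model with single-parent commits only, so merges do not arise, and it ignores any other filtering a real rebase may do. The names parents, ancestors, and commits_to_copy are hypothetical, and c0 is treated as the root purely for brevity.

```python
# Parent links for the pre-rebase example graph.
parents = {
    "c0": None,  # assumption: treat c0 as the root; real history goes further back
    "c1": "c0", "c2": "c1", "c3": "c2",
    "c4": "c3", "c5": "c4",  # tip of b1 is c5
    "c6": "c3", "c7": "c6",  # tip of b2 (HEAD) is c7
}


def ancestors(commit):
    """The commit itself plus everything reachable by walking parent links."""
    out = []
    while commit is not None:
        out.append(commit)
        commit = parents[commit]
    return out


def commits_to_copy(head, upstream):
    """Commits reachable from head but not from upstream, oldest first."""
    excluded = set(ancestors(upstream))
    picked = [c for c in ancestors(head) if c not in excluded]
    return list(reversed(picked))


print(commits_to_copy("c7", "c0"))  # ['c1', 'c2', 'c3', 'c6', 'c7']
```

c4 and c5 never appear because the walk from c7 never reaches them, and c0 is removed because it is reachable from the upstream argument.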
Pick a target for --onto. If you use --onto, you choose this directly. If not, you choose it with your upstream argument. For instance, with git rebase master, the upstream is master and the --onto target is the commit to which the name master points. Git does a git checkout --detach here (or the internal equivalent) so as to get off the branch with the commits you're copying.
Begin copying commits, as if by git cherry-pick, one at a time. Some rebase operations literally use git cherry-pick and some don't, quite.
When the copying is finished, move the original branch name so that it points to HEAD, which is the last copied commit, or, if we didn't copy any commits after all, the --onto target. Then get back on that branch, as if by git checkout name.
Using --onto lets you change the upstream argument without also setting the target of the rebase.
So, if you want to copy just c8 and c9, you can tell by inspection that the --onto target is c7, and that the first commit you don't want copied is c7'. If you have the hash ID of the original c7 available, you could, for instance, run:
git rebase --onto <hash-of-c7> <hash-of-c7'>
while on branch b2. But to find the original c7, before the first rebase, you'll have to rummage through the reflogs. This can be difficult, as the reflogs tend to contain a lot of motion, and once you've copied one commit once, you may have copied it many times. [2]
So we can instead just let Git copy c6' and c7' again. We'll set the upstream to c3' as the first commit not to copy, and set c3 as the --onto target:
git rebase --onto <hash-of-c3> <hash-of-c3'>
for instance. Git will enumerate the commits by walking back from HEAD (c9) until it reaches c3', which you said not to copy (nor anything earlier). That will list c6', c7', c8, and c9 as the commits to copy. The copies are to go after c3 (--onto). Note that commits c3 and c3' are both easily visible in the history and crude ASCII graph that Git draws, which you can view with:
git log --graph --oneline b1 b2
so this gives you your hash IDs for the --onto and upstream arguments.
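As a rough check on what git rebase --onto <hash-of-c3> <hash-of-c3'> should produce, the whole operation can be simulated on a toy graph. This models only the bookkeeping (enumerate, copy, move the branch name) and none of the actual patch application or conflict handling. The helper names rebase_onto and copy_name are hypothetical, and the copy names follow the drawings above (c6' becomes c6", c8 becomes c8').

```python
# Graph after the first rebase plus the two extra commits c8 and c9.
parents = {
    "c0": None,  # assumption: treat c0 as the root for this sketch
    "c1": "c0", "c2": "c1", "c3": "c2",
    "c4": "c3", "c5": "c4",
    "c1'": "c0", "c2'": "c1'", "c3'": "c2'",
    "c6'": "c3'", "c7'": "c6'", "c8": "c7'", "c9": "c8",
}
branches = {"b1": "c5", "b2": "c9"}


def ancestors(commit):
    """The commit itself plus everything reachable by walking parent links."""
    out = []
    while commit is not None:
        out.append(commit)
        commit = parents[commit]
    return out


def copy_name(old):
    """Hypothetical label for a copy: c6' -> c6", c8 -> c8'."""
    if old.endswith("'"):
        return old[:-1] + '"'
    return old + "'"


def rebase_onto(branch, onto, upstream):
    """Copy (head minus upstream) on top of `onto`, then move the branch name."""
    excluded = set(ancestors(upstream))
    todo = [c for c in reversed(ancestors(branches[branch])) if c not in excluded]
    tip = onto
    for old in todo:
        new = copy_name(old)
        parents[new] = tip  # each copy's parent is the previous copy (or onto)
        tip = new
    branches[branch] = tip
    return todo


copied = rebase_onto("b2", onto="c3", upstream="c3'")
print(copied)                      # ["c6'", "c7'", 'c8', 'c9']
print(ancestors(branches["b2"]))
```

After the simulated rebase, b2 and b1 share c0 through c3 again, which matches the "probably OK" drawing.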
[1] c9 and its abandoned history are all still there, in your Git repository, if you want them back later. They can be found via Git's reflogs. The reflog entries only last for some period of time. After 1 to 3 months by default, the reflog entries expire, and are deleted. Once that happens, the abandoned commits themselves can also be deleted for real, after which you can't get them back, at least not through your own Git.
(The details of reflogs and reflog expiration are a little complicated but not all that relevant here.)
[2] To view the reflog for branch b2, run git reflog b2. If you're lucky, there aren't a lot of copies or a lot of random motion, and you can find c7 this way.
Upvotes: 2