Reputation: 3973
I have a file in my local branch and I want to be able to rebase onto origin/main while making sure that, after the rebase, this file in my local branch is exactly the same as it is right now.
Is there a way to do a rebase and guarantee that? Even better if during the rebase I don't have to answer any questions or resolve any conflicts for this file.
Upvotes: 1
Views: 1714
Reputation: 488213
Use a temporary tag to mark a commit that has the desired copy of the file. Then, use git rebase -i and insert x commands to run a short script after each pick. You have a choice of what, precisely, to put in this script, but this (untested) might be what you want:
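#! /bin/sh
# Copy the wanted version of the file into the index and the working tree.
git checkout temp-tag -- path
# Amend the commit just made only if that changed anything.
git diff-index --quiet HEAD || git commit --amend --no-edit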
Once this is all done, remove the temporary tag (and the script; it's not like it was difficult to write, and it has the tag and path hardcoded).
To make sense of this answer, start by memorizing this fact: in Git, files aren't really in branches. Files are really in commits.
Commits are contained in branches—or in other words, found by using branch names, then working from commit to commit, backwards, through the links that Git stores in each commit. So you can go from branch name to commit and thence to file. But that "to commit" step is critical, because each commit has a full snapshot of every file.
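To see this for yourself, here is a minimal sketch that builds a throwaway repository (the notes.txt file name, the demo identity, and the temporary directory are all made up for illustration) and reads the same path out of two different commits with git show:

```shell
#!/bin/sh
# Sketch: the same path can hold different content in different commits.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)                           # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline   # do not rely on the default branch name

echo "first version" > notes.txt
git add notes.txt
git commit -q -m "first"
echo "second version" > notes.txt
git commit -q -am "second"

# The branch name only finds the tip commit; each commit holds its own snapshot.
old=$(git show mainline~1:notes.txt)        # copy stored in the parent commit
new=$(git show mainline:notes.txt)          # copy stored in the tip commit
echo "parent commit: $old"
echo "tip commit:    $new"
```

The name mainline gets you to one commit; the file content comes from whichever commit you then ask about.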
Next, let's look at what git rebase does and how it does it. Remember that Git is all about commits, and each commit has a unique hash ID. No part of any existing commit can ever be changed. So, since rebase literally can't change any of the existing commits, it necessarily has to work by copying the old (and lousy, or at least inadequate in some way) commits to new-and-improved commits. These new-and-improved commits are the same as the old commits in some way, and different in some way.
Each commit, as found by its unique hash ID, has two parts:
There's the main data of a commit: the source code snapshot that goes with this commit. These aren't changes. The snapshot has each file exactly as it should appear if that one particular commit is checked out later.
Besides the data, each commit has some metadata, or information about the commit itself: who made it (name and email address), when (date and time stamp), and so on.
The metadata separate the "who made this commit" into two parts: the author is the name, email, and timestamp from whoever made the commit originally, and the committer is the name, email, and timestamp of the person who made this variant of the commit. So when we copy an old commit like this, we retain the original author, but set up a new committer. If you're copying your own commits, this means that the name-and-email doesn't really change—the old one had you as both, and the new one has you as both—but the committer time-stamps do change.
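The author/committer split is easy to observe. This sketch uses git commit --amend rather than a rebase to make the replacement commit (an assumption made to keep it short; the effect on the two timestamps is the same kind of thing), with made-up dates and a throwaway repository:

```shell
#!/bin/sh
# Sketch: replacing a commit keeps the author data but refreshes the committer data.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

echo one > file.txt
git add file.txt
# Original commit: author and committer timestamps are both in 2020.
GIT_AUTHOR_DATE='2020-06-01 12:00:00 +0000' \
GIT_COMMITTER_DATE='2020-06-01 12:00:00 +0000' \
    git commit -q -m "original"

# Replacement commit made "later": only the committer timestamp is new.
GIT_COMMITTER_DATE='2021-06-01 12:00:00 +0000' \
    git commit -q --amend --no-edit

author_date=$(git log -1 --format=%ad --date=short)
committer_date=$(git log -1 --format=%cd --date=short)
echo "author date:    $author_date"
echo "committer date: $committer_date"
```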
Most importantly, though, each commit records the hash ID of its previous or parent commit. The point of rebasing is typically to take a string of commits like this:
          I--J--K   <-- feature
         /
...--G--H--L   <-- mainline
and make new-and-improved versions of commits I, J, and K, so that the new commits descend from L rather than from H:
          I--J--K   <-- feature
         /
...--G--H--L   <-- mainline
            \
             I'-J'-K'   <-- new-and-improved-feature
where commit I' is a "copy" (sort of) of commit I, J' is a copy of J, and K' is a copy of K.
Without worrying too much about the mechanics of the copying process (though I'll mention here that it uses git cherry-pick), let's make one last observation, which is that the way we (and Git) find commits is to use the branch name to find the last commit in the chain. When commit H was the last commit of mainline, we found it because we had:
...--G--H   <-- mainline
The name mainline held the hash ID of commit H. So git checkout mainline would extract commit H for us to use or work on/with. But then we, or someone, made a new commit that added on to mainline, which we are calling commit L, so that we have:
...--G--H--L   <-- mainline
The name mainline now holds the hash ID of commit L. A git checkout mainline command will extract commit L for us to use. To even find commit H, we have to have Git open up commit L and read its metadata. This metadata contains the raw hash ID of earlier commit H.
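As a small sketch of that lookup (throwaway repository, empty commits labeled H and L to match the drawings, demo identity made up), git rev-parse turns the name into a hash ID and git cat-file -p shows the parent line stored in the commit's metadata:

```shell
#!/bin/sh
# Sketch: a branch name holds one hash ID; the parent hash lives in the commit itself.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline   # do not rely on the default branch name

git commit -q --allow-empty -m H
git commit -q --allow-empty -m L

tip=$(git rev-parse mainline)               # hash ID stored under the name (commit L)
parent=$(git rev-parse mainline~1)          # Git walks one link back (commit H)

# The raw commit object: tree, parent, author, committer, then the message.
git cat-file -p "$tip"
recorded=$(git cat-file -p "$tip" | sed -n 's/^parent //p')
echo "parent recorded inside L: $recorded"
```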
What this means for us is that once we have accomplished this:
          I--J--K   <-- feature
         /
...--G--H--L   <-- mainline
            \
             I'-J'-K'   <-- new-and-improved-feature
we can take the name feature off commit K and paste it onto commit K' instead, like this:
          I--J--K   ???
         /
...--G--H--L   <-- mainline
            \
             I'-J'-K'   <-- feature
Now, when we try to see what commits are on branch feature, we'll have Git start by using the name feature to locate commit K'. Commit K' points back to earlier commit J', which points back to I', which points back to L. Our rebase will be complete once we move the branch name, and toss out any funky special name that we might have been using while building the I'-J'-K' sequence.
(Exercise: What happens to commits I-J-K? Does it matter? How would we even know if they're still in the repository?)
How git rebase works
I mentioned above, rather briefly, that git rebase uses git cherry-pick to copy each commit. The cherry-pick command, in turn, works by ... well, technically it's a full-blown three-way merge, but it's easier to see it, at first, by looking at what happens when we compare just two commits.
Let's start with this, our "before" picture:
          I--J--K   <-- feature
         /
...--G--H--L   <-- mainline
We need to have Git check out commit L, which is where we want to have the new commits go. If we were doing this the normal way, we'd make a new branch name such as tmp, using:
git checkout -b tmp <hash-of-L>
(or git switch -c tmp <hash-of-L> in Git 2.23 or later). Git actually uses what it calls detached HEAD mode for this, with the special name HEAD pointing directly to a commit:
git checkout <hash-of-L>
or:
git switch --detach <hash-of-L>
which produces this:
          I--J--K   <-- feature
         /
...--G--H--L   <-- HEAD, mainline
Now Git runs git cherry-pick hash-of-I. Git saved the hash IDs of commits I, J, and K during the whole setup process. If you use git rebase --interactive here, you'll see pick commands that list these hash IDs.[1] The pick represents a cherry-pick command.
The cherry-pick itself winds up comparing the saved snapshot in commit H against the saved snapshot in commit I. The difference between these two snapshots is, in effect, a set of instructions that can be applied to a snapshot as well. Applying that set of instructions to the snapshot in H produces the snapshot in I. But what if we apply these instructions to the snapshot in L?
If we do just that (assuming it works and has no merge conflicts[2]) and make a new commit from the result, we'll get commit I'. We will have Git save the original author information and the original commit message as-is, generate a new set of committer information, and use the snapshot we got by applying the diff. The result is:
          I--J--K   <-- feature
         /
...--G--H--L   <-- mainline
            \
             I'   <-- HEAD
Git now goes on to do a git cherry-pick hash-of-J, to copy commit J by comparing I-vs-J and applying this to I':
          I--J--K   <-- feature
         /
...--G--H--L   <-- mainline
            \
             I'-J'   <-- HEAD
Finally, since there are only three commits, we do our last cherry-pick of commit K, which compares J-vs-K (and J-vs-J' if you are interested in the merge aspect of cherry-pick) to build commit K', which leaves us with this:
          I--J--K   <-- feature
         /
...--G--H--L   <-- mainline
            \
             I'-J'-K'   <-- HEAD
and the only task left is to move the name feature to point to the current commit K' to get:
          I--J--K   ???
         /
...--G--H--L   <-- mainline
            \
             I'-J'-K'   <-- feature (HEAD)
This completes the rebase process.
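The whole sequence can be imitated by hand. This is only a sketch of the steps as described above, not what git rebase literally executes internally: it builds the H/I/J/K/L graph in a throwaway repository (the make_commit helper, the one-file-per-commit layout, and the demo identity are invented for the example), detaches HEAD at L, cherry-picks the three commits, and then moves the name feature:

```shell
#!/bin/sh
# Sketch: do by hand what the rebase walkthrough above describes.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline   # do not rely on the default branch name

# Hypothetical helper: commit one new file named after the commit label.
make_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

make_commit H
git checkout -q -b feature                  # feature starts at H
make_commit I
make_commit J
make_commit K
git checkout -q mainline
make_commit L                               # mainline moves on to L

# Step 1: detached HEAD at commit L.
git checkout -q --detach mainline

# Step 2: copy I, J, K (oldest first), one cherry-pick each.
for c in $(git rev-list --reverse mainline..feature); do
    git cherry-pick "$c" >/dev/null
done

# Step 3: move the name feature onto the last copy and re-attach HEAD.
git branch -f feature HEAD
git checkout -q feature

git log --format=%s mainline..feature       # subjects of the copies, newest first
```

Because every commit here touches its own file, none of the cherry-picks can conflict; a real rebase can stop partway through and ask for help.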
[1] The instruction sheet for git rebase, that you get to edit, has the hash IDs abbreviated. I've never been quite sure why: Git has to expand them back out to use them internally. Maybe the Git folks just think they look less intimidating when there are 7 or 12 random-looking characters instead of 40. For git describe output, where this might go in someone's email or something, sure; but here, they're just instructions on a temporary page, and if you edit them you can use "move line" instructions in your editor.
[2] Merge conflicts, if any, arise from comparing the snapshot in H vs the snapshot in L as well. That's the case for the first cherry-pick, at least. The two subsequent cherry-picks use commits I and J as the merge bases, with the --ours commits being the commit built in the previous step. This is where it all gets a little tricky.
I believe what you want is that, after each cherry-pick, some particular file in the new (copied) commit exactly matches that file as it appears in some particular earlier commit.
Let's assume that existing commit K has the desired version of the file. What we'll do, to avoid depending on Git not moving the name feature and to let you pick any commit, is to create a temporary lightweight tag identifying this commit:
git tag temp-tag <hash-of-K-or-whatever>
Note: if there is not a single fixed version of the file that should go into every copied commit, you'll want a different strategy for locating the source commit for the checkout, but the rest can continue to work.
Next, we'll use git rebase -i. This turns the set of cherry-picks into an editable instruction sheet. Using our editor, after each pick command, we add a line using the exec or x command:
pick <hash>
x /tmp/script
(assuming our little script has been put in /tmp/script and made executable).
Git will execute the cherry-pick command, all the way to its completion, which involves making the new commit (I', J', or K' in our example). Then it will run the script because of this x line. The script:
Extracts a particular file from a particular commit: using temp-tag, we get the desired file from the desired commit, placing it into both Git's index and the working tree. (The index copy is the one that matters, but it's good to update the working tree too, for sanity's sake if nothing else.)
Tests to see if the result merits replacing the tip commit (git commit --amend). This is our git diff-index --quiet HEAD. If the index still matches the current commit, there's nothing to change. Otherwise, we'll run git commit --amend, which shoves the current commit out of the way and makes a new one. Using --no-edit, we tell git commit to simply re-use the existing commit message.
Note: In this case, even if there are no changes, git commit --amend --no-edit is actually safe, but it's wasted effort. For this script and task, that's probably not really relevant, but it seems good not to perform a lot of unnecessary work.
So, this will make sure that each replacement commit is itself replaced during the rebase, with a "corrected" replacement with the single file swapped out to the one we want. That way, by the time Git gets around to yanking the branch name off the old branch and putting it onto the end of the replacement commits, each of the replacement commits is the actual desired new-and-improved commit.
Aside from cleaning up (removing the lightweight temp-tag tag and removing the script), nothing else needs to be done.
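Here is an end-to-end sketch of the procedure in a throwaway repository. Several things in it are assumptions rather than part of the recipe above: the file is called keep.txt, the script lives in a temporary directory instead of /tmp/script, git rebase -x is used to add the exec line after every pick instead of editing the instruction sheet by hand, the script is run through sh so it need not be executable, and git diff-index gets --cached so that it compares only the index against HEAD. The edits to keep.txt are placed far apart so that the cherry-picks themselves do not conflict; if a pick does conflict on the file, the rebase still stops there, before the script ever runs.

```shell
#!/bin/sh
# Sketch: rebase feature onto mainline while forcing keep.txt to the tagged version.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
repo="$work/repo"
script="$work/fix-file.sh"                  # hypothetical script location
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/mainline   # do not rely on the default branch name

# H: keep.txt has ten lines, so edits to the first and last line merge cleanly.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > keep.txt
git add keep.txt
git commit -q -m H

# feature: I edits the last line of keep.txt; J and K add unrelated files.
git checkout -q -b feature
sed 's/^10$/ten/' keep.txt > keep.tmp && mv keep.tmp keep.txt
git commit -q -am I
echo j > j.txt; git add j.txt; git commit -q -m J
echo k > k.txt; git add k.txt; git commit -q -m K

# mainline: L edits the first line of keep.txt.
git checkout -q mainline
sed 's/^1$/one/' keep.txt > keep.tmp && mv keep.tmp keep.txt
git commit -q -am L

# Mark the commit holding the wanted copy of the file (old K here).
git checkout -q feature
git tag temp-tag feature
want=$(git show temp-tag:keep.txt)

# The per-commit script, with tag and path hardcoded.
cat > "$script" <<'EOF'
#!/bin/sh
git checkout temp-tag -- keep.txt
git diff-index --quiet --cached HEAD || git commit -q --amend --no-edit
EOF

# -x inserts "exec <command>" after each pick in the instruction sheet.
git rebase -q -x "sh '$script'" mainline

# Every copied commit should now carry the tagged version of keep.txt.
for c in $(git rev-list mainline..feature); do
    [ "$(git show "$c:keep.txt")" = "$want" ] || echo "mismatch in $c"
done

# Clean up the temporary tag and the script.
git tag -d temp-tag >/dev/null
rm -f "$script"
```

Without the script, the first copied commit would have picked up mainline's change to line 1; with it, keep.txt in every copy stays exactly as it was in the tagged commit.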
Upvotes: 4
Reputation: 3973
One workaround would be for me to copy-paste the file to a scratch pad, run the rebase with -Xours and then paste over the end result from my scratch pad.
I don't really like that solution (and it doesn't generalise if we're talking about more than one file under conflict) but it seems like that's the quickest way forward.
Upvotes: 0