Reputation: 4826
When rewriting history with git filter-branch --tag-name-filter cat …, using --prune-empty and/or --subdirectory-filter=…, you can end up in a situation where the commits that were tagged are removed.
That's reasonable so far and works as designed.
the question / goal
What I now want to achieve is: preserve the tags on a nearby rewritten commit
example:
starting from A -> B(tag: foo) -> C -> D -> E
(where E is newer than D newer than C …)
running git filter-branch
I get either
A' -> B'(tag: foo)' -> E'
( ^ the good case )
or: A' -> D' -> E'
( ^ the bad case )
What I'm trying to get then is: A'(tag: foo)' -> D' -> E'
since A'
represents what has been tagged in B
some research:
The first thing I stumbled over was git cherry
in Git: Is there a way to figure out where a commit was cherry-pick'ed from? but this does not seem to help much in finding the differences, since the trees are disjoint.
Instead, I already found a useful example of --commit-filter
https://stackoverflow.com/a/14783391/529977 to write a log of the rewritten objects
some ideas:
With that --commit-filter
"mapping file" in mind, I would theoretically be able to
git log --oneline -1 ${tag}
other ideas I had were:
git log -1 --format="%an%ae%at%cn%ce%ct%s" | sha1sum
in the original tree, then traverse history down to the next known tag, but this comes close to the idea above. It still sounds like a hard way, and I don't even have a good idea how to solve these steps ... any other ideas or known solutions (?!) welcome!
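The fingerprint idea above could be sketched roughly as follows in Python. This is only an illustration, assuming the author, committer, and subject fields survive the rewrite unchanged. The names fingerprint and match_rewritten and the dictionary layout are hypothetical and not part of git, and a NUL separator is used between fields, which the one-liner above does not do.

```python
import hashlib

# Same placeholders as in the git log format string above.
FIELDS = ("an", "ae", "at", "cn", "ce", "ct", "s")

def fingerprint(commit):
    """Hash the metadata fields that the rewrite is assumed to leave untouched."""
    joined = "\0".join(str(commit[f]) for f in FIELDS)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()

def match_rewritten(original, rewritten):
    """Map each original commit hash to the rewritten hash with the same
    fingerprint, or to None if no rewritten commit matches."""
    by_print = {fingerprint(c): h for h, c in rewritten.items()}
    return {h: by_print.get(fingerprint(c)) for h, c in original.items()}
```

Commits that map to None are the ones that were removed; for those, the tag would still have to be moved to a neighbouring commit that does have a match.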
Upvotes: 7
Views: 1600
Reputation: 31
I found using git_commit_non_empty_tree
unreliable. Another approach, which is relatively simple, is to re-apply the tags to the first occurrence of the tree hash. This is not the `correct' answer in the presence of back-outs, but might actually be desirable, depending on your use case.
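for tag in $(git tag)
do
    # tree hash of the commit the tag currently points at
    t=$(git rev-parse "$tag^{tree}")
    # git log lists newest first, so the last match is the first occurrence
    r=$(git log --format='%T %H' | grep "^$t" | tail -n 1 | sed -e 's/.* //')
    git tag -f "$tag" "$r"
done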
The git log
output can obviously be cached. This needs to be done after running filter-branch
without --prune-empty
and is then followed by
git filter-branch --prune-empty --tag-name-filter cat -- --all
to remove the empty commits. This only works for lightweight tags, but if you're using filtering you probably want to convert annotated tags to lightweight ones first and then reapply them at the end.
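To illustrate the caching remark, here is a minimal Python sketch of the same lookup, done once over the whole log instead of once per tag. It assumes the input lines come from git log --format='%T %H' in the default newest-first order. The function names and the tag_trees dictionary are hypothetical.

```python
def oldest_commit_per_tree(log_lines):
    """Map each tree hash to the oldest commit that has that tree.

    log_lines: lines of the form '<tree> <commit>', newest commit first.
    """
    oldest = {}
    for line in log_lines:
        tree, commit = line.split()
        # Later lines are older commits, so the last write wins.
        oldest[tree] = commit
    return oldest

def retarget(tag_trees, log_lines):
    """Return {tag: commit} for every tag whose tree still occurs in the log."""
    oldest = oldest_commit_per_tree(log_lines)
    return {tag: oldest[tree] for tag, tree in tag_trees.items() if tree in oldest}
```

Tags whose tree no longer appears anywhere in the log are simply left out of the result, so the caller can decide whether to delete them.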
Upvotes: 2
Reputation: 1851
Deleted:            *    *         *                   *    *         *
Tags:               R    S    T    U                        V         W
Commits:  A -> B -> C -> D -> E -> F -> G -> H -> I -> J -> K -> L -> M -> N
Expected output:
Tags:          R    T              V    W
Commits:  A -> B -> E -> G -> H -> I -> L -> N
We will be testing this with --prune-empty
so we create empty commits for the commits which should be deleted. Let's set up the test repository.
git init
touch n && git add n && git commit -m "N"
git commit --allow-empty -m "M"
touch l && git add l && git commit -m "L"
git commit --allow-empty -m "K"
git commit --allow-empty -m "J"
touch i && git add i && git commit -m "I"
touch h && git add h && git commit -m "H"
touch g && git add g && git commit -m "G"
git commit --allow-empty -m "F"
touch e && git add e && git commit -m "E"
git commit --allow-empty -m "D"
git commit --allow-empty -m "C"
touch b && git add b && git commit -m "B"
touch a && git add a && git commit -m "A"
git tag W $(git log --pretty=oneline --grep=M | cut -d " " -f1)
git tag V $(git log --pretty=oneline --grep=K | cut -d " " -f1)
git tag U $(git log --pretty=oneline --grep=F | cut -d " " -f1)
git tag T $(git log --pretty=oneline --grep=E | cut -d " " -f1)
git tag S $(git log --pretty=oneline --grep=D | cut -d " " -f1)
git tag R $(git log --pretty=oneline --grep=C | cut -d " " -f1)
To begin with we are going to create a file containing all the tag names and the commit hashes they point to.
for i in $(git tag); do echo $i; git log -1 --pretty=oneline $i | cut -d " " -f1; done > ../tags
When running git filter-branch
the commit hashes will change. To keep track of those changes we create a file with mappings from the old commit hashes to the new commit hashes. The trick to do that is shown here.
The --subdirectory-filter=...
command would then look like this:
git filter-branch --subdirectory-filter=... --commit-filter 'echo -n "${GIT_COMMIT}," >>/tmp/commap; git commit-tree "$@" | tee -a /tmp/commap'
Since the --prune-empty
option conflicts with the --commit-filter
we need to change something. The documentation of --prune-empty
helps here:
Some filters will generate empty commits that leave the tree untouched. This option instructs git-filter-branch to remove such commits if they have exactly one or zero non-pruned parents; merge commits will therefore remain intact. This option cannot be used together with --commit-filter, though the same effect can be achieved by using the provided git_commit_non_empty_tree function in a commit filter.
So the --prune-empty
command which we will be using for this test looks like this. Make sure that /tmp/commap
doesn't exist or is empty before you run the command.
git filter-branch --commit-filter 'echo -n "${GIT_COMMIT}," >>/tmp/commap; git_commit_non_empty_tree "$@" | tee -a /tmp/commap'
mv /tmp/commap ../commap
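The resulting commap file holds one old,new pair per line. As a rough Python sketch of how it can be read, the following assumes a linear history in which a pruned commit is recorded with the rewritten hash of its parent, so a repeated new hash marks a deleted commit. The function names are hypothetical.

```python
def parse_commap(lines):
    """Turn 'old,new' lines into a list of (old, new) tuples, skipping blank lines."""
    pairs = []
    for line in lines:
        line = line.strip()
        if line:
            old, new = line.split(",")
            pairs.append((old, new))
    return pairs

def pruned(pairs):
    """Return the old hashes whose new hash repeats the previous entry's new hash."""
    result = []
    last_new = None
    for old, new in pairs:
        if new == last_new:
            result.append(old)
        last_new = new
    return result
```

With merge commits or several branches the "previous line is the parent" assumption no longer holds, so this check is only meant for the linear test repository used here.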
We have now run git filter-branch
and gathered all the information needed to deal with the tags. We will have to delete some tags and change the commit that other tags point to. We are lucky here: for a lightweight tag stored as a loose ref, git keeps the commit hash the tag points to simply in .git/refs/tags/TAGNAME
.
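Writing into .git/refs/tags directly only works while the ref exists as a loose file, not once refs have been packed. As a hedged alternative, a script can emit git update-ref commands instead, which go through git's own ref handling. A minimal sketch, with the hypothetical function name move_command:

```python
import shlex

def move_command(tagname, new_hash):
    """Build a shell command that points the lightweight tag at new_hash."""
    ref = "refs/tags/{}".format(tagname)
    # shlex.quote guards against tag names containing shell metacharacters.
    return "git update-ref {} {}".format(shlex.quote(ref), shlex.quote(new_hash))
```

git tag -f TAGNAME HASH would be another way to get the same result for lightweight tags.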
Now what's left is to write a script to automatically correct the tags. Here is what I wrote in Python.
def delete(tagname):
    print('git tag -d {}'.format(tagname))

def move(tagname, tagref):
    print('echo "{}" > .git/refs/tags/{}'.format(tagref, tagname))

tags = {}
with open('tags') as tagsfile:
    for i, line in enumerate(tagsfile):
        if i%2 == 0:
            tagname = line[:-1]
        else:
            # if there are multiple tags on one commit
            # we discard all but one
            tagref = line[:-1]
            if tagref in tags:
                delete(tags[tagref])
            tags[tagref] = tagname

commap = []
with open('commap') as commapfile:
    for line in commapfile:
        old, new = line[:-1].split(',')
        commap.append((old, new))

lastnew = None
takentag = None
for old, new in commap:
    if old in tags:
        if takentag:
            delete(takentag)
        takentag = tags[old]
    if new != lastnew:
        # commit was not deleted
        if takentag:
            move(takentag, new)
            takentag = None
    lastnew = new
The script outputs the commands needed to adjust the tags. In our example this is the output:
echo "0593fe3aa7a50d41602697f51f800d34b9887ba3" > .git/refs/tags/W
echo "93e65edf18ec8e33e5cc048e87f8a9c5270dd095" > .git/refs/tags/V
git tag -d U
echo "41d9e45de069df2c8f2fdf9ba1d2a8b3801e49b2" > .git/refs/tags/T
git tag -d S
echo "a0c4c919f841295cfdb536fcf8f7d50227e8f062" > .git/refs/tags/R
After pasting the commands to the console the git repository looks as expected:
$ git log
8945e933c1d8841ffee9e0bca1af1fce84c2977d A
a0c4c919f841295cfdb536fcf8f7d50227e8f062 B
41d9e45de069df2c8f2fdf9ba1d2a8b3801e49b2 E
6af1365157d705bff79e8c024df544fcd24371bb G
108ddf9f5f0a8c8d1e17042422fdffeb147361f2 H
93e65edf18ec8e33e5cc048e87f8a9c5270dd095 I
0593fe3aa7a50d41602697f51f800d34b9887ba3 L
5200d5046bc92f4dbe2aee4d24637655f2af5d62 N
$ git tag
R
T
V
W
$ git log -1 --pretty=oneline R
a0c4c919f841295cfdb536fcf8f7d50227e8f062 B
$ git log -1 --pretty=oneline T
41d9e45de069df2c8f2fdf9ba1d2a8b3801e49b2 E
$ git log -1 --pretty=oneline V
93e65edf18ec8e33e5cc048e87f8a9c5270dd095 I
$ git log -1 --pretty=oneline W
0593fe3aa7a50d41602697f51f800d34b9887ba3 L
Upvotes: 2