Qiulang

Reputation: 12395

Is this a bug in git subtree pull --squash when the previously pulled code was rebased?

git subtree pull seems to have a bug when combined with rebase. But when I googled it I couldn't find any related information (in fact there isn't much information about git subtree at all).

Let me first show my subtree setup. I have 3 subtrees added as 3 sub-directories:

ws ➤ ls -d */                                                                                                                            
ccfront/  wsclient/ wsserver/
ws ➤ git remote -v                                                                                                                       
ccfront ssh://git@xxxxx/webcc/cc-frontend.git (fetch)
ccfront ssh://git@xxxxx/webcc/cc-frontend.git (push)
origin  ssh://git@xxxxx/webcc/ws_all.git (fetch)
origin  ssh://git@xxxxx/webcc/ws_all.git (push)
wsclient    ssh://git@xxxxx/webcc/bs-front.git (fetch)
wsclient    ssh://git@xxxxx/webcc/bs-front.git (push)
wsserver    ssh://git@xxxxx/webcc/ws_redis.git (fetch)
wsserver    ssh://git@xxxxx/webcc/ws_redis.git (push)

Normally git subtree pull works fine, but if I rebase the resulting commits in the "parent" repository and later run git subtree pull again to get the latest code from the subtree, git subtree pull behaves unexpectedly. The following is an example:

//The pull should only bring in the latest code, as the commit message shows:
//Squashed 'wsserver/' changes from 5e997710..1cc96493

git subtree pull --prefix=wsserver wsserver v4 --squash -m 'test'


The pulled changes should be the difference between 5e997710 and 1cc96493 in wsserver, i.e. only 4 files changed:

git diff --name-only 5e997710 1cc96493
"doc/\347\242\260\345\210\260\344\270\200\344\272\233\351\227\256\351\242\230.md"
src/test/ccbackend/Pipfile
src/test/ccbackend/Pipfile-bak
src/test/ccbackend/Pipfile.lock
(END)

But that is NOT what happened, as shown below. Files from the ccfront subtree were also added, and I don't know why. This only happens when I rebase the resulting commits in the "parent" repository, so it looks like a bug to me.

Has anyone else experienced this?

git show --stat --name-only e5e7f5cd
commit e5e7f5cdc8e6a385fbf98788ad0f0e1994864d6a
Author: qiulang@macbook3 <[email protected]>
Date:   Mon Apr 26 17:18:05 2021 +0800

Squashed 'wsserver/' changes from 5e997710..1cc96493

1cc96493 过去开发文档整理
18f68394 3.8
d13ece2a test

git-subtree-dir: wsserver
git-subtree-split: 1cc96493f7fac9ecb2e2fe4bf4436586345cc182

.env
.gitignore
.prettierrc
.vscode/launch.json
Dockerfile
builder.py
ccfront/.babelrc
ccfront/.dockerignore
ccfront/.editorconfig
ccfront/.env
ccfront/.eslintignore
ccfront/.eslintrc.js
ccfront/.gitignore
ccfront/.postcssrc.js
ccfront/.vscode/launch.json
ccfront/.vscode/settings.json
ccfront/Dockerfile
ccfront/README.md
"ccfront/api.js review\346\204\217\350\247\201.md"
ccfront/build/build.js
ccfront/build/check-versions.js
ccfront/build/sed.js
ccfront/build/utils.js
ccfront/build/vue-loader.conf.js
ccfront/build/webpack.base.conf.js

Upvotes: 0

Views: 215

Answers (1)

VonC

Reputation: 1323753

Check if the issue persists with Git 2.39 (Q4 2022): it includes a bugfix to git subtree in its split and merge features.

See commit 1762382, commit 0d33067, commit f10d31c, commit 7990142, commit 34ab458, commit 5626a9e, commit 2e94339, commit a50fcc1, commit 455f0ad (21 Oct 2022) by Philippe Blain (phil-blain).
(Merged by Taylor Blau -- ttaylorr -- in commit a23e0b6, 30 Oct 2022)

subtree: fix squash merging after annotated tag was squashed merged

Signed-off-by: Philippe Blain

When 'git subtree merge --squash $ref' is invoked, either directly or through 'git subtree pull --squash $repo $ref', the code looks for the latest squash merge of the subtree in order to create the new merge commit as a child of the previous squash merge.

This search is done in function 'process_subtree_split_trailer', invoked by 'find_latest_squash', which looks for the most recent commit with a 'git-subtree-split' trailer; that trailer's value is the object name in the subtree repository of the ref that was last squash-merged.
The function verifies that this object is present locally with 'git rev-parse', and aborts if it's not.
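To make the lookup described above concrete, here is a minimal sketch that is not the actual git-subtree code. The helper name latest_subtree_split is hypothetical, and the throwaway repository with its hand-written commit message (reusing the trailer values from the question) only stands in for a real squash merge:

```shell
#!/bin/sh
set -e

# Hypothetical helper (not part of git-subtree): print the git-subtree-split
# value of the most recent commit whose message has "git-subtree-dir: <prefix>".
latest_subtree_split() {
    git log -1 --grep="^git-subtree-dir: $1\$" --format=%B |
        sed -n 's/^git-subtree-split: //p'
}

# Throwaway repository with a hand-written squash-style commit message.
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty \
    -m "Squashed 'wsserver/' changes" \
    -m "git-subtree-dir: wsserver
git-subtree-split: 1cc96493f7fac9ecb2e2fe4bf4436586345cc182"

split=$(latest_subtree_split wsserver)
echo "last squashed subtree commit: $split"

# Same kind of presence check as described above: is the object known locally?
if git rev-parse -q --verify "$split^{commit}" >/dev/null 2>&1; then
    echo "object is present locally"
else
    echo "object is missing locally"
fi
```

In a real repository, only the helper and the final check would be needed; the temporary repository is just there so the sketch can run anywhere.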

The hash referenced by the 'git-subtree-split' trailer is guaranteed to correspond to a commit since it is the result of running 'git rev-parse -q --verify "$1^{commit}"' on the first argument of 'cmd_merge' (this corresponds to 'rev' in 'cmd_merge' which is passed through to 'new_squash_commit' and 'squash_msg').

But this is only the case since e4f8baa ("subtree: parse revs in individual cmd_ functions", 2021-04-27, Git v2.32.0-rc0 -- merge listed in batch #15), which went into Git 2.32.
Before that commit, 'cmd_merge' verified the revision it was given using 'git rev-parse --revs-only "$@"'.
Such an invocation, when fed the name of an annotated tag, would return the hash of the tag, not of the commit referenced by the tag.
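The difference between the two rev-parse invocations can be reproduced in a throwaway repository. This is only an illustrative sketch; the tag name v1 and the demo identity are assumptions made for the example:

```shell
#!/bin/sh
set -e

# Throwaway repository; every name used here is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m 'initial'
git -c user.name=demo -c user.email=demo@example.com \
    tag -a v1 -m 'annotated tag'

# Pre-2.32 style check: yields the hash of the tag object itself.
tag_hash=$(git rev-parse --revs-only v1)
# 2.32+ style check: peels the tag to the commit it points to.
commit_hash=$(git rev-parse -q --verify 'v1^{commit}')

echo "tag object:    $tag_hash ($(git cat-file -t "$tag_hash"))"
echo "commit object: $commit_hash ($(git cat-file -t "$commit_hash"))"
```

The two hashes differ, which is why a 'git-subtree-split' trailer written by an older git subtree can name an object that is absent from the parent repository.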

This leads to a failure in 'find_latest_squash' when squash-merging if the most recent squash-merge merged an annotated tag of the subtree repository, using a pre-2.32 version of git subtree, unless that previous annotated tag is present locally (which is not usually the case).

We can fix this by fetching the object directly by its hash in 'process_subtree_split_trailer' when 'git rev-parse' fails, but in order to do so we need to know the name or URL of the subtree repository.
This is not possible in general for git subtree merge, but is easy when it is invoked through git subtree pull since in that case the subtree repository is passed by the user at the command line.

Allow the git subtree pull scenario to work out-of-the-box by adding an optional 'repository' argument to functions 'cmd_merge', 'find_latest_squash' and 'process_subtree_split_trailer', and invoke 'cmd_merge' with that 'repository' argument in 'cmd_pull'.

If 'repository' is absent in 'process_subtree_split_trailer', instruct the user to try fetching the missing object directly.
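Fetching a missing object directly by its hash could look like the sketch below. It assumes the subtree repository still advertises the tag, and it uses two local throwaway repositories as stand-ins for the real subtree and parent repositories; it is not the code from the commit itself:

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# Stand-in for the subtree repository, with an annotated tag.
git init -q "$work/subtree"
git -C "$work/subtree" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m 'subtree commit'
git -C "$work/subtree" -c user.name=demo -c user.email=demo@example.com \
    tag -a v1 -m 'annotated tag'
tag_hash=$(git -C "$work/subtree" rev-parse v1)

# Stand-in for the parent repository, which has never seen that tag object.
git init -q "$work/parent"
cd "$work/parent"
if ! git rev-parse -q --verify "$tag_hash^{commit}" >/dev/null 2>&1; then
    # The object is missing locally: fetch it by hash from the known repository.
    git fetch -q "$work/subtree" "$tag_hash"
fi

fetched_type=$(git cat-file -t "$tag_hash")
echo "fetched object type: $fetched_type"
```

With a real setup, the equivalent manual step would be to run git fetch against the subtree remote (for example wsserver in the question) with the hash from the 'git-subtree-split' trailer.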

Upvotes: 1
