Cactus

Reputation: 27626

`git submodule update` always keeps fetching the same commit

I have a Git repo with one submodule and two commits that change the submodule's HEAD. If I check out one of the commits, and do a git submodule update, this succeeds without having to connect to a remote, as I would expect since I have the given commit available locally. If I check out the other commit, however, git submodule update always does a fetch (and fails if the remote is not accessible), even though it keeps re-fetching the same commit.

Example session:

$ git checkout -q 9089923 && git submodule update
Submodule path 'chirp8-engine': checked out '341b24f2c3168a6e226ae3c249179426eff6dd99'
$ git checkout -q 289a15d && git submodule update
From github.com:gergoerdi/chirp8-engine
 * branch            2fd6ace109f6337df97056aacdb40e5fb51f0e97 -> FETCH_HEAD
Submodule path 'chirp8-engine': checked out '2fd6ace109f6337df97056aacdb40e5fb51f0e97'
$ git checkout -q 9089923 && git submodule update
Submodule path 'chirp8-engine': checked out '341b24f2c3168a6e226ae3c249179426eff6dd99'
$ git checkout -q 289a15d && git submodule update
From github.com:gergoerdi/chirp8-engine
 * branch            2fd6ace109f6337df97056aacdb40e5fb51f0e97 -> FETCH_HEAD
Submodule path 'chirp8-engine': checked out '2fd6ace109f6337df97056aacdb40e5fb51f0e97'
$ git checkout -q 9089923 && git submodule update
Submodule path 'chirp8-engine': checked out '341b24f2c3168a6e226ae3c249179426eff6dd99'
$ git checkout -q 289a15d && git submodule update
From github.com:gergoerdi/chirp8-engine
 * branch            2fd6ace109f6337df97056aacdb40e5fb51f0e97 -> FETCH_HEAD
Submodule path 'chirp8-engine': checked out '2fd6ace109f6337df97056aacdb40e5fb51f0e97'

What is the reason for this, and how do I change it so that git submodule update doesn't keep re-fetching the same commit?

Edited to add

I started tracing git, and the difference all comes down to what git submodule--helper does:

$ GIT_TRACE=1 git submodule--helper run-update-procedure --oid 341b24f2c3168a6e226ae3c249179426eff6dd99  -- chirp8-engine
11:08:30.962038 git.c:455               trace: built-in: git submodule--helper run-update-procedure --oid 341b24f2c3168a6e226ae3c249179426eff6dd99 -- chirp8-engine
11:08:30.962259 run-command.c:668       trace: run_command: cd chirp8-engine; unset GIT_PREFIX; GIT_DIR=.git git rev-list -n 1 341b24f2c3168a6e226ae3c249179426eff6dd99 --not --all
11:08:30.963983 run-command.c:668       trace: run_command: cd chirp8-engine; unset GIT_PREFIX; GIT_DIR=.git git rev-list -n 1 341b24f2c3168a6e226ae3c249179426eff6dd99 --not --all
11:08:30.965639 run-command.c:668       trace: run_command: cd chirp8-engine; unset GIT_PREFIX; GIT_DIR=.git git checkout -q -f 341b24f2c3168a6e226ae3c249179426eff6dd99
11:08:30.966504 git.c:455               trace: built-in: git checkout -q -f 341b24f2c3168a6e226ae3c249179426eff6dd99
Submodule path 'chirp8-engine': checked out '341b24f2c3168a6e226ae3c249179426eff6dd99'
$ GIT_TRACE=1 git submodule--helper run-update-procedure --oid 2fd6ace109f6337df97056aacdb40e5fb51f0e97  -- chirp8-engine 
11:08:35.917283 git.c:455               trace: built-in: git submodule--helper run-update-procedure --oid 2fd6ace109f6337df97056aacdb40e5fb51f0e97 -- chirp8-engine
11:08:35.917522 run-command.c:668       trace: run_command: cd chirp8-engine; unset GIT_PREFIX; GIT_DIR=.git git rev-list -n 1 2fd6ace109f6337df97056aacdb40e5fb51f0e97 --not --all
11:08:35.919289 run-command.c:668       trace: run_command: cd chirp8-engine; unset GIT_PREFIX; GIT_DIR=.git git fetch
11:08:35.920134 git.c:455               trace: built-in: git fetch
11:08:35.920490 run-command.c:668       trace: run_command: unset GIT_DIR GIT_PREFIX; GIT_PROTOCOL=version=2 ssh -o SendEnv=GIT_PROTOCOL git@github.com 'git-upload-pack '\''gergoerdi/chirp8-engine.git'\'''
11:08:38.965515 run-command.c:668       trace: run_command: git rev-list --objects --stdin --not --all --quiet --alternate-refs
11:08:39.272774 run-command.c:1597      run_processes_parallel: preparing to run up to 1 tasks
11:08:39.272791 run-command.c:1629      run_processes_parallel: done
11:08:39.272799 run-command.c:668       trace: run_command: git maintenance run --auto --no-quiet
11:08:39.273925 git.c:455               trace: built-in: git maintenance run --auto --no-quiet
11:08:39.274557 run-command.c:668       trace: run_command: cd chirp8-engine; unset GIT_PREFIX; GIT_DIR=.git git rev-list -n 1 2fd6ace109f6337df97056aacdb40e5fb51f0e97 --not --all
11:08:39.276381 run-command.c:668       trace: run_command: cd chirp8-engine; unset GIT_PREFIX; GIT_DIR=.git git fetch origin 2fd6ace109f6337df97056aacdb40e5fb51f0e97
11:08:39.277324 git.c:455               trace: built-in: git fetch origin 2fd6ace109f6337df97056aacdb40e5fb51f0e97
11:08:39.277686 run-command.c:668       trace: run_command: unset GIT_DIR GIT_PREFIX; GIT_PROTOCOL=version=2 ssh -o SendEnv=GIT_PROTOCOL git@github.com 'git-upload-pack '\''gergoerdi/chirp8-engine.git'\'''
11:08:42.446569 run-command.c:668       trace: run_command: git rev-list --objects --stdin --not --all --quiet --alternate-refs
From github.com:gergoerdi/chirp8-engine
 * branch            2fd6ace109f6337df97056aacdb40e5fb51f0e97 -> FETCH_HEAD
11:08:42.754265 run-command.c:1597      run_processes_parallel: preparing to run up to 1 tasks
11:08:42.754283 run-command.c:1629      run_processes_parallel: done
11:08:42.754295 run-command.c:668       trace: run_command: git maintenance run --auto --no-quiet
11:08:42.755355 git.c:455               trace: built-in: git maintenance run --auto --no-quiet
11:08:42.755976 run-command.c:668       trace: run_command: cd chirp8-engine; unset GIT_PREFIX; GIT_DIR=.git git checkout -q -f 2fd6ace109f6337df97056aacdb40e5fb51f0e97
11:08:42.756942 git.c:455               trace: built-in: git checkout -q -f 2fd6ace109f6337df97056aacdb40e5fb51f0e97
Submodule path 'chirp8-engine': checked out '2fd6ace109f6337df97056aacdb40e5fb51f0e97'

In particular, the trace diverges right at the start:

$ GIT_TRACE=1 git submodule--helper run-update-procedure --oid 341b24f2c3168a6e226ae3c249179426eff6dd99  -- chirp8-engine
11:08:30.962038 git.c:455               trace: built-in: git submodule--helper run-update-procedure --oid 341b24f2c3168a6e226ae3c249179426eff6dd99 -- chirp8-engine
11:08:30.962259 run-command.c:668       trace: run_command: cd chirp8-engine; unset GIT_PREFIX; GIT_DIR=.git git rev-list -n 1 341b24f2c3168a6e226ae3c249179426eff6dd99 --not --all
11:08:30.963983 run-command.c:668       trace: run_command: cd chirp8-engine; unset GIT_PREFIX; GIT_DIR=.git git rev-list -n 1 341b24f2c3168a6e226ae3c249179426eff6dd99 --not --all

vs

$ GIT_TRACE=1 git submodule--helper run-update-procedure --oid 2fd6ace109f6337df97056aacdb40e5fb51f0e97  -- chirp8-engine 
11:08:35.917283 git.c:455               trace: built-in: git submodule--helper run-update-procedure --oid 2fd6ace109f6337df97056aacdb40e5fb51f0e97 -- chirp8-engine
11:08:35.917522 run-command.c:668       trace: run_command: cd chirp8-engine; unset GIT_PREFIX; GIT_DIR=.git git rev-list -n 1 2fd6ace109f6337df97056aacdb40e5fb51f0e97 --not --all
11:08:35.919289 run-command.c:668       trace: run_command: cd chirp8-engine; unset GIT_PREFIX; GIT_DIR=.git git fetch

Indeed, there is a difference in the output of cd chirp8-engine; unset GIT_PREFIX; GIT_DIR=.git git rev-list -n 1 $SUBMODULE_COMMIT --not --all depending on which commit is checked out in the submodule, but that should be expected since 2fd6ace is an ancestor of 341b24f.

Upvotes: 1

Views: 349

Answers (1)

torek

Reputation: 487785

git submodule update generally means this:

  • get a list of submodules
  • for each submodule:
    • find a commit hash ID: it's stored in the superproject's index, so run git rev-parse :chirp8-engine for instance;
    • cd into the submodule repository working tree;
    • run git fetch if that commit is missing from the submodule repository or is not reachable from any of its refs; and
    • run git switch --detach <hash> to check out that commit.

That's what you're seeing here: this is how submodules are supposed to work. It's the superproject, not the submodule, that "knows which commit to use".

With that in mind:

"git submodule update always does a fetch (and fails if the remote is not accessible), even though it keeps re-fetching the same commit."

The command could probably be smarter, but it's not.
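The trace in the question hints at why the fetch repeats. Before fetching, Git runs git rev-list -n 1 <hash> --not --all inside the submodule, and that prints the hash unless some ref (or HEAD) reaches it. A commit that was fetched by hash is recorded only in FETCH_HEAD, which --all does not cover, so the check presumably keeps failing until a ref points at the commit. Here is a minimal sketch of that check in a throwaway repository; the repository, the g helper, and the branch name keep are all made up for the demonstration:

```shell
#!/bin/sh
# Throwaway repository; every name here is hypothetical.
set -e
cd "$(mktemp -d)"
git init -q demo
cd demo
# Helper: run git with a fixed identity so commits work on any machine.
g() { git -c user.name=Demo -c user.email=demo@example.invalid -c commit.gpgsign=false "$@"; }

g commit -q --allow-empty -m first
first=$(git rev-parse HEAD)
g commit -q --allow-empty -m second
second=$(git rev-parse HEAD)

# Detach HEAD at "first" and delete the branch: "second" still exists in the
# object database, but neither HEAD nor any ref reaches it any more.
branch=$(git symbolic-ref --short HEAD)
g checkout -q --detach "$first"
g branch -q -D "$branch"

# The same check that appears in the trace. Empty output means "reachable".
reachable=$(git rev-list -n 1 "$first" --not --all)
unreachable=$(git rev-list -n 1 "$second" --not --all)

# Pointing any ref at the commit makes the check come back empty again.
g branch keep "$second"
after_ref=$(git rev-list -n 1 "$second" --not --all)

echo "first:            '$reachable'"
echo "second (no ref):  '$unreachable'"
echo "second (with ref): '$after_ref'"
```

If that reading is right, creating a branch or tag in the submodule that points at the recorded commit should also stop the repeated fetch, without changing how git submodule update is invoked.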

There are ways to make it act differently:

  • git submodule update --no-fetch: this tells it never to run git fetch. If you don't already have the commit you need, this makes the git switch --detach step fail, but it does avoid the git fetch.

  • git submodule update --remote: this changes how the hash ID step (done first above) works. Instead of using git rev-parse :<path> to read the hash ID from the superproject's index, Git will:

    • run git fetch early (unless suppressed by --no-fetch, in which case there's no fetch at all); then
    • use git rev-parse in the submodule to obtain a hash ID, using a name constructed by pasting the remote name and a branch name together. The result of this pasting-up is typically a remote-tracking name like origin/somebranch where somebranch is the name recorded in submodule.<path>.branch.

    The general idea behind this sequence is to obtain a possibly-never-seen-before hash ID, since git fetch might update some remote-tracking names. Suppose that submodule.foo.branch is develop, and we run git fetch origin and get a new hash ID for origin/develop in the Git repository that holds the foo submodule. We now run git rev-parse origin/develop within that submodule to get the hash ID. So we'll now update to (or merge or rebase using) the resulting hash ID.

  • --checkout, --merge, --rebase: these control what gets done with the hash ID once it's obtained. The operation (switch --detach, merge, or rebase) happens in the submodule repository, using the hash ID obtained by the obtain-hash-ID step.
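To make the --remote case concrete, here is a small sketch using local throwaway repositories. The names upstream, super, foo, and the g helper are invented for the example, and protocol.file.allow=always is passed only because newer Git versions restrict local-path submodules by default. It compares the hash recorded in the superproject's index with the one behind origin/develop in the submodule:

```shell
#!/bin/sh
# Throwaway repositories; every name here is hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
# Helper: fixed identity, and allow the local file transport for submodules.
g() { git -c user.name=Demo -c user.email=demo@example.invalid \
          -c commit.gpgsign=false -c protocol.file.allow=always "$@"; }

# Upstream repository whose only branch is "develop".
g init -q upstream
g -C upstream symbolic-ref HEAD refs/heads/develop
g -C upstream commit -q --allow-empty -m "upstream 1"

# Superproject; "-b develop" records submodule.foo.branch in .gitmodules.
g init -q super
cd super
g submodule --quiet add -b develop "$work/upstream" foo
g commit -q -m "add foo"

# Upstream moves on; the superproject still records the old hash.
g -C "$work/upstream" commit -q --allow-empty -m "upstream 2"
recorded=$(git rev-parse :foo)

# --remote: fetch in the submodule, then use origin/develop instead of :foo.
g submodule --quiet update --remote
tracked=$(git -C foo rev-parse origin/develop)
checked_out=$(git -C foo rev-parse HEAD)
upstream_tip=$(git -C "$work/upstream" rev-parse develop)
still_recorded=$(git rev-parse :foo)

echo "recorded in index: $recorded"
echo "origin/develop:    $tracked"
echo "submodule HEAD:    $checked_out"
```

Note that the superproject's index is left alone here: the submodule now sits on the new commit, and it is up to you to git add foo and commit if you want the superproject to record it.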

What I think you're asking for here is simply the --no-fetch option. However, submodules are a big pain in the <insert body part>; making them work "as desired" is often very tricky, in part because we're often not even sure what "desired" means in the first place. One way or another, we all discover, when we use submodules, why people call them sob-modules....
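For what it's worth, the situation from the question can be reproduced offline. The sketch below is built on assumptions: the repositories upstream, super, sub, and the g helper are made up, and the "unreachable remote" is simulated by renaming the upstream directory. The submodule commit exists locally but no ref reaches it, so a plain update tries to fetch and fails, while --no-fetch just checks it out:

```shell
#!/bin/sh
# Throwaway repositories; every name here is hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
# Helper: fixed identity, and allow the local file transport for submodules.
g() { git -c user.name=Demo -c user.email=demo@example.invalid \
          -c commit.gpgsign=false -c protocol.file.allow=always "$@"; }

g init -q upstream
g -C upstream commit -q --allow-empty -m "upstream commit"

g init -q super
cd super
g submodule --quiet add "$work/upstream" sub
g commit -q -m "S1: add submodule"
s1=$(git rev-parse HEAD)

# Make a commit in the submodule on a detached HEAD, so no branch holds it,
# and record it in the superproject.
g -C sub checkout -q --detach
g -C sub commit -q --allow-empty -m "unreferenced commit"
c=$(git -C sub rev-parse HEAD)
g add sub
g commit -q -m "S2: bump submodule"
s2=$(git rev-parse HEAD)

# Go back to S1; the submodule moves away and the new commit loses its last
# reference (only the reflog remembers it).
g checkout -q "$s1"
g submodule --quiet update

# Simulate an unreachable remote, then return to S2.
mv "$work/upstream" "$work/upstream.gone"
g checkout -q "$s2"

# Plain update wants to fetch and fails; --no-fetch uses the local object.
if g submodule --quiet update 2>/dev/null; then plain=ok; else plain=failed; fi
g submodule --quiet update --no-fetch
now=$(git -C sub rev-parse HEAD)

echo "plain update:    $plain"
echo "after --no-fetch: $now"
```

The flip side, as noted above, is that --no-fetch fails at the checkout step whenever the commit really is missing, so something like git submodule update --no-fetch || git submodule update may be a reasonable compromise.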

Upvotes: 3
