Andrew Arnott

Reputation: 81791

How to find the commit(s) that point to a git tree object?

While I am trying to mirror a repo to a remote server, the server is rejecting tree object 4e8f805dd45088219b5662bd3d434eb4c5428ec0. Incidentally, this is not a top-level tree but a subdirectory.

How can I find out which commit(s) indirectly reference that tree object, so that I can avoid pushing the refs that lead to those commits and get the rest of my repo to push properly?

Upvotes: 21

Views: 8170

Answers (2)

torek

Reputation: 488193

As you noted, you just need to find the commit(s) with the desired tree. If it could be a top-level tree you would need one extra test, but since it's not, you don't.

You want:

  • for some set of commits (all those reachable from a given branch name, for instance)
  • if that commit has, as a sub-tree, the target tree hash: print the commit ID

which is trivial with two Git "plumbing" commands plus grep.

Here's a slightly updated version of my original script (updated to take arguments, and default to --all as in badp's edit):

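#! /bin/sh
#
# git-searchfor: list commits that contain a given tree as a subdirectory.
case $# in
0) echo "usage: git-searchfor <object-id> [<starting commit>...]" 1>&2; exit 1;;
esac

# Resolve the argument to a full hash, then insist that it names a tree.
searchfor=$(git rev-parse --verify "$1") || exit 1
searchfor=$(git rev-parse --verify "$searchfor^{tree}") || exit 1
shift

# With no starting points given, search every ref.
[ $# -gt 0 ] || set -- --all

# tformat: ends every line, including the last, with a newline,
# so that read also sees the oldest commit.
git log "$@" --pretty='tformat:%H' |
    while read -r commithash; do
        # -d -r lists only the (sub)trees of the commit, recursively.
        if git ls-tree -d -r --full-tree "$commithash" | grep -F -- "$searchfor"; then
            echo " -- found at $commithash"
        fi
    done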

To check top-level trees as well, you would also run git cat-file -p $commithash and see whether its tree line carries the hash.
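The top-level check can be sketched as a small shell function. This is only an illustration: has_top_level_tree is a hypothetical name, and it uses git rev-parse with the ^{tree} suffix rather than parsing the git cat-file -p output, which should give the same answer.

```shell
#!/bin/sh
# Hypothetical helper (not part of the script above): succeeds if the
# commit's root tree is exactly the tree we are looking for.
has_top_level_tree() {
    # $1 = tree hash to look for, $2 = commit to inspect
    top=$(git rev-parse --verify --quiet "$2^{tree}") || return 1
    [ "$top" = "$1" ]
}
```

Inside the loop you could then add something like: if has_top_level_tree "$searchfor" "$commithash"; then echo " -- top-level tree of $commithash"; fi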

Note that this same code will find blobs too, provided you take out the -d option from git ls-tree and drop the line that forces $searchfor to be a tree. There is no risk of confusing the two, since no tree can have the ID of a blob, or vice versa. The grep prints the matching line, so you'll see, e.g.:

040000 tree a3a6276bba360af74985afa8d79cfb4dfc33e337    perl/Git/SVN/Memoize
 -- found at 3ab228137f980ff72dbdf5064a877d07bec76df9

To clean this up for general use, you might want to use git cat-file -t on the search-for blob-or-tree to get its type.
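A minimal sketch of that type check, assuming you want to pick the git ls-tree options from the object's type. The function name ls_tree_opts_for is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the ls-tree options that suit the object's type.
ls_tree_opts_for() {
    # $1 = object ID (or any expression that names a tree or a blob)
    case $(git cat-file -t "$1" 2>/dev/null) in
    tree) echo "-d -r --full-tree" ;;   # list only trees, recursing into them
    blob) echo "-r --full-tree" ;;      # list files (blobs) recursively
    *)    echo "not a tree or blob: $1" >&2; return 1 ;;
    esac
}
```

In the script you would then run opts=$(ls_tree_opts_for "$searchfor") || exit 1 once up front, and call git ls-tree $opts "$commithash" in the loop (leaving $opts unquoted on purpose so it splits into separate options).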

As jthill notes in a comment, git diff-tree now has a --find-object option. This was introduced in Git 2.17 (released in 2018, well after the original question here). The git log command has it as well. Note that it reports only the commits that change the number of occurrences of the object, i.e. those that add or remove it, rather than every commit that contains it, but we're usually more interested in which specific commit added a file or tree anyway. By removing the extra line that tries to force the searchfor hash ID to be a tree, we get a much faster script that works for any tree or blob object (though you must take care to specify the correct hash ID, or use the ^{tree} suffix yourself, if you're going to supply a commit hash ID). Then we just run:

git log --all --find-object=$searchfor

or, as in the comment below:

git rev-list --all | git diff-tree --stdin --find-object=$searchfor

to find what we're looking for. (Replace --all with ${2-"--all"} if/as desired.)
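Put together, the --find-object variant might look like the sketch below. The function name commits_touching_object is hypothetical, and the default of --all is carried over from the script above as an assumption about what you want.

```shell
#!/bin/sh
# Hypothetical helper: print the hash of each commit whose changes
# add or remove the given tree or blob.
commits_touching_object() {
    # $1 = tree or blob ID; remaining args = starting points (default: --all)
    obj=$(git rev-parse --verify --quiet "$1") || return 1
    shift
    [ $# -gt 0 ] || set -- --all
    git log --format=%H --find-object="$obj" "$@"
}
```

For the tree from the question this would be called as commits_touching_object 4e8f805dd45088219b5662bd3d434eb4c5428ec0, optionally followed by branch names to limit the search.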

Upvotes: 27

shawkinaw

Reputation: 3180

A variation on torek's great answer, in case you want to speed things up with GNU Parallel:

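#!/bin/bash
# usage: <script> <object-id> [<starting commit>]
searchfor="$1"
startpoints="${2-HEAD}"   # a single starting point; defaults to HEAD

# Run one ls-tree per commit in parallel; {} is replaced by the commit hash.
git rev-list "$startpoints" |
    parallel "if git ls-tree -d -r --full-tree '{}' | grep '$searchfor'; then echo ' -- found at {}'; fi"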

Upvotes: 3
