sqln00b

Reputation: 391

Git: Identify the version of an untracked copy and match it to the corresponding commit(s)

TL;DR

Sometimes, for whatever reason, you (or at least I) end up copying a certain state of a project to somewhere outside the repo.

Is there a way for Git to compare such files against all blobs and find the matching commit(s)?

Example

I have 10 commits in my project.

  a) At commit #6 I send an archive of the project by email.
  b1) At commit #6 I copy my project to an untracked location.
  b2) At commit #6 I copy my project to an untracked location and make changes.

Months later I find the copy but don't remember whether I (accidentally) made any changes to it.

Now I want to find out which commit(s) it matches, or whether it matches any of my commits at all (usually to decide whether I can delete it).

Ideally I could find out which commit(s) it matches most closely, and how many lines were added, deleted and modified compared to each commit.

Can git do that by itself? Is there any other tool that's able to do that?

Disclaimer

English is not my mother tongue, please feel free to correct/edit/restructure this question

Upvotes: 1

Views: 83

Answers (2)

ElpieKay

Reputation: 30868

  1. Make the archive into a git repo.

    git init
    git add .
    git commit -m 'hello world'
    git log -1 --pretty=raw

The output contains a line of the form tree <40-digit-sha1>.

  2. Find the commit that points to the same tree in your original project.

    git log --pretty=raw | grep -B 1 <40-digit-sha1>

  3. If two commits point to the same tree, the two archives made from those commits have the same contents.
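
As a possible variation on step 2, the lookup can be done without grepping the raw log by asking git log to print each commit hash next to its tree hash. This is only a sketch; find_commits_by_tree is a hypothetical helper name, not a Git command, and it assumes you run it inside the original project.

```shell
#!/bin/sh
# Sketch: list every commit (reachable from any ref) whose root tree
# equals the tree ID passed as the first argument.
# find_commits_by_tree is a hypothetical helper name.
find_commits_by_tree() {
    # %H = commit hash, %T = hash of that commit's root tree
    git log --all --format='%H %T' |
        awk -v want="$1" '$2 == want { print $1 }'
}
```

Called as find_commits_by_tree <40-digit-sha1>, it prints one commit hash per line, or nothing if no commit has that tree.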

Upvotes: 0

torek

Reputation: 488183

There is nothing built in to Git to do that.

There is a relatively easy method you can implement with a script: add a new commit (or at least a tree; we don't need a commit) to your repository consisting of the archived version. This only works if the new tree is bit-for-bit identical to the original in file names, contents, and permissions (executable vs. not executable). For instance, if you left out the .gitignore when sending the files, the new tree won't match the actual commit that has the .gitignore file.
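
To illustrate how strict that match is, here is a small sketch that flips only the executable bit of one file and shows that the tree ID changes. It runs in a throwaway repository in a temporary directory; the file name run.sh is made up for the demonstration, and it assumes a file system that tracks the executable bit.

```shell
#!/bin/sh
# Sketch: same file name and contents, different mode => different tree ID.
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1
git init -q .
git config core.fileMode true     # make sure mode changes are recorded

printf 'echo hi\n' > run.sh
chmod 644 run.sh
git add run.sh
tree_plain=$(git write-tree)      # tree with a non-executable run.sh

chmod 755 run.sh
git add run.sh
tree_exec=$(git write-tree)       # tree with an executable run.sh

echo "non-executable: $tree_plain"
echo "executable:     $tree_exec"
```

The two printed IDs differ even though the file contents are byte-for-byte the same.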

Here is a way to do it, written up as an outline:

  1. create an empty temporary index
  2. git add every file in the test tree to this temporary index
  3. use git write-tree to write the temporary index into the repository as a tree

The output of git write-tree in step 3 is a tree ID. Now you need only (only?!) visit every commit in the repository, or every commit of interest at least, and compare its tree object to the ID you just got:

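# Use a throwaway index so the repository's real index stays untouched.
GIT_INDEX_FILE=$(mktemp) || exit $?
# mktemp leaves a zero-byte file, which Git rejects as an invalid index;
# remove it so Git starts from a missing (that is, empty) index instead.
rm -f "$GIT_INDEX_FILE"
export GIT_INDEX_FILE
git add ...
tree=$(git write-tree) || exit $?
# Visit every commit reachable from any ref and compare its tree ID.
git rev-list --all | while read -r hash; do
    commit=$(git rev-parse -q --verify "$hash^{commit}" 2>/dev/null) || continue
    testtree=$(git rev-parse "$commit^{tree}")
    if [ "$testtree" = "$tree" ]; then
        echo "test tree matches existing commit $commit"
        [ "$commit" != "$hash" ] &&
            echo "(via $hash, which is a $(git cat-file -t "$hash"))"
        echo "git describe says: $(git describe "$commit")"
    fi
done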

(This is not tested at all, and is missing some clean-up code, such as removing the temporary index.)
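
A fuller sketch of the same idea, with the clean-up included, might look like the following. The helper names tree_of_dir and compare_copy are hypothetical, and the script assumes the copy sits in a plain directory and that your Git is recent enough for rev-parse --absolute-git-dir. As an extension beyond exact matching, it also prints git diff --shortstat for commits that do not match, which gives a rough idea of how far the copy is from each commit.

```shell
#!/bin/sh
# Sketch: compare an untracked copy against every commit of a repository.
# tree_of_dir and compare_copy are hypothetical helper names.

# Print the tree ID that the contents of directory $2 would have,
# writing the objects into repository $1 through a throwaway index.
tree_of_dir() {
    gitdir=$(git -C "$1" rev-parse --absolute-git-dir) || return
    tmp=$(mktemp -d) || return
    (
        cd "$2" || exit
        # The index path does not exist yet, so Git starts with an empty index.
        export GIT_DIR="$gitdir" GIT_WORK_TREE="$PWD" GIT_INDEX_FILE="$tmp/index"
        git add -A . && git write-tree
    )
    status=$?
    rm -rf "$tmp"                 # clean up the temporary index
    return "$status"
}

# For each commit in repository $1, report an exact match with the copy
# in directory $2, or a short summary of the differences.
compare_copy() {
    tree=$(tree_of_dir "$1" "$2") || return
    git -C "$1" rev-list --all | while read -r commit; do
        if [ "$(git -C "$1" rev-parse "$commit^{tree}")" = "$tree" ]; then
            echo "$commit exact match"
        else
            # --shortstat output starts with a space, e.g. " 1 file changed, ..."
            echo "$commit$(git -C "$1" diff --shortstat "$commit" "$tree")"
        fi
    done
}
```

It would be called as compare_copy /path/to/repo /path/to/copy. Note that git add honors ignore rules, so files ignored by a .gitignore in the copy are left out of the computed tree. The tree and blob objects written this way stay in the repository as unreferenced objects until Git's garbage collection eventually prunes them.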

Upvotes: 1
