Ben

Reputation: 862

What would happen if two Git commits had the same SHA-1 hash?

Let me preface this by saying that I am aware of the minuscule odds of this happening. I know that a collision would be more or less impossible to manufacture, and extremely unlikely to happen "in the wild." This is simply a what-if question about the internals of Git.

So, here is my question: what would happen if two Git commit hashes were identical? For starters:

Upvotes: 24

Views: 7178

Answers (2)

VonC

Reputation: 1323025

My old answer "How would git handle a SHA-1 collision on a blob?" would still apply, even for a commit and not a blob.
As torek mentions in the comments, git just thinks of everything as "objects", each with their own SHA1.
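To illustrate why a commit and a blob are handled alike: as described in the Pro Git book's Git Internals chapter, an object's SHA-1 is computed over a short header (the type, a space, the content size, and a NUL byte) followed by the content. Below is a minimal sketch of that header only. The name object_header_sketch is hypothetical, and the hashing step is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes "<type> <size>" followed by a NUL byte into buf.
 * Returns the header length including the NUL, or -1 if buf is too small.
 * Hypothetical helper for illustration; not Git's actual function.
 */
int object_header_sketch(const char *type, size_t size, char *buf, size_t buflen)
{
    assert(type != NULL && buf != NULL);
    int n = snprintf(buf, buflen, "%s %zu", type, size);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n + 1; /* count the terminating NUL, which is part of the hashed data */
}
```

Only the type string differs between a commit and a blob here, so the rest of the storage path has no reason to treat them differently.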

https://git-scm.com/book/en/v2/book/10-git-internals/images/data-model-4.png

(Image from Git Internals - Git References chapter of the ProGit Book v2)

While the commit would likely not succeed (there are a couple of checks in git-commit-tree.c), you also have to consider the case where two commits with the same SHA1 (and somehow different content) are created in repos A and B... and repo A is fetching repo B!
Commit 8685da4 (March 2007, git 1.5.1) took care of that, and the fetch would fail.
Commit 0e8189e (Oct. 2008, git 1.6.1) does mention that, with index V2:

the odds for a SHA1 reference to get corrupted so it actually matches the SHA1 of another object with the same size (the delta header stores the expected size of the base object to apply against) are virtually zero.

It still implements a packed object CRC check when unpacking objects.
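As a rough illustration of what such a CRC check does, here is a standalone bitwise CRC-32 (the common IEEE polynomial) with a comparison against a stored value. This is a sketch, not Git's code: the names crc32_sketch and packed_bytes_intact are hypothetical, and a real implementation would normally call a library routine instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 using the reflected IEEE polynomial 0xEDB88320. */
uint32_t crc32_sketch(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Returns 1 when the bytes read back still match the stored CRC, else 0. */
int packed_bytes_intact(const unsigned char *data, size_t len, uint32_t stored_crc)
{
    return crc32_sketch(data, len) == stored_crc;
}
```

A CRC like this catches accidental corruption of the packed bytes. It says nothing about two different objects sharing one SHA-1.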

The Git code quoted in the answer below is the finalize_object_file() function. A blame shows no recent modification, with most of the code dating back to the very beginning of Git (2005): on a collision, no new commit is created.

Upvotes: 9

user803422

Reputation: 2814

According to the source code (as of git v2.17), if a commit led to an already existing SHA-1, this is what would happen on Linux (it might be different on other operating systems).

Would the commit succeed?

Yes and no: the git commit command would return as if it had succeeded, but the new commit object would not be created.

Could it later be checked out as a detached head?

No.

Reference: file sha1-file.c (commit fc1395f4a491a7da46a446233531005634eb979d)

int finalize_object_file(const char *tmpfile, const char *filename)
{
    int ret = 0;

    if (object_creation_mode == OBJECT_CREATION_USES_RENAMES)
        goto try_rename;
    else if (link(tmpfile, filename))
        ret = errno;

    /*
     * Coda hack - coda doesn't like cross-directory links,
     * ...
     */
    if (ret && ret != EEXIST) {
    try_rename:
        if (!rename(tmpfile, filename))
            goto out;
        ret = errno;
    }
    unlink_or_warn(tmpfile);
    if (ret) {
        if (ret != EEXIST) {
            return error_errno("unable to write sha1 filename %s", filename);
        }
        /* FIXME!!! Collision check here ? */
    }

out:
    if (adjust_shared_perm(filename))
        return error("unable to set permission to '%s'", filename);
    return 0;
}

The link() call fails with EEXIST, the temporary file is removed, and execution continues to the final return 0, passing through the FIXME and through adjust_shared_perm(), which has no reason to fail. The existing object file is left untouched.
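The behaviour can be reproduced outside Git with a reduced model of the link() branch of the code quoted above. This is only a sketch under assumptions: finalize_sketch, write_file and read_file are hypothetical names, and the rename fallback and the permission adjustment are left out.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create `path` containing `content`. Returns 0 on success. */
int write_file(const char *path, const char *content)
{
    assert(path != NULL && content != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    size_t len = strlen(content);
    ssize_t n = write(fd, content, len);
    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}

/* Hypothetical helper: read `path` into buf as a NUL-terminated string. */
int read_file(const char *path, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, buflen - 1);
    close(fd);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}

/*
 * Reduced model of the link() branch: returns 0 even when `filename`
 * already exists, and never overwrites the existing file.
 */
int finalize_sketch(const char *tmp_path, const char *filename)
{
    int ret = 0;

    if (link(tmp_path, filename))
        ret = errno;
    unlink(tmp_path);           /* the temporary file is always removed */
    if (ret && ret != EEXIST)
        return -1;              /* a genuine error */
    return 0;                   /* EEXIST falls through as success */
}
```

Calling finalize_sketch() twice with the same destination but different temporary contents returns 0 both times, and the destination still holds the first contents.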

Upvotes: 2
