Kevin Burke

Reputation: 64854

Compute GitHub API file SHA

I have a file whose contents are "from test" - 9 bytes. The documentation says that the SHA of created files is computed using SHA1:

The file's SHA-1 hash is computed and stored in the blob object.

(from https://developer.github.com/v3/git/blobs/)

However, when I compute the hex-encoded SHA1 output of "from test", I get 5669556d9a5c27fdd649dcaaa0873757c2aa402f.

The GitHub API says that the SHA is 62b551731eada762035d4665978027cd44291290. This is both the ETag returned and the value of "sha" in the API response for retrieving a file. In addition, when I call the CreateFile endpoint with "from test" as the content and 566955... as the sha, I'm told that the SHA is incorrect.

I've also tried appending newlines, computing the SHA of the base64-encoded content, and computing the SHA of the base64-encoded content plus a trailing newline. None of them gives me 62b551731eada762035d4665978027cd44291290. How is GitHub computing that value?

I've double-checked that the contents of the remote file are the same ("from test"), yet the SHA is still different.

Upvotes: 4

Views: 1640

Answers (1)

Kevin Burke

Reputation: 64854

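Ah - GitHub is computing the SHA-1 of blob <length>\x00<contents>, where <length> is the length of the content in bytes, written as a decimal number, and \x00 is a single null byte. This is the same hash that Git itself assigns to a blob object, which is why hashing the raw contents alone never matches.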

sha1("blob 9\x00from test") yields the correct sum!
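For anyone who wants to reproduce this, here is a minimal Python sketch of the same computation. It assumes the content is already available as raw bytes, and the helper name git_blob_sha is just something I made up for illustration:

```python
import hashlib


def git_blob_sha(content: bytes) -> str:
    """Return the hex SHA-1 that Git assigns to a blob with this content."""
    # The header is the word "blob", a space, the decimal byte length,
    # and a single null byte; the raw content follows immediately.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


# Hashes the 9-byte content from the question with the blob header prepended.
print(git_blob_sha(b"from test"))
```

The length is counted in bytes, not characters, so non-ASCII text has to be encoded (for example as UTF-8) before hashing. On the command line, `git hash-object <file>` computes the same value for a file on disk.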

See https://stackoverflow.com/a/7225329/329700 for more info.

Upvotes: 5
