Mort

Reputation: 3549

Git corruption "unable to read [sha]" but git fsck reports no errors

I have a git repository with what appears to be a missing blob. A git gc or a git repack fails complaining "fatal: unable to read 89a9259486af9e3f0b24f3338ec39b18a7ba39c3". However, a git fsck does not find the issue. I know I'll probably have to delete and prune a branch somewhere, but I can't figure out where. Can somebody point me to how to debug and fix the "unable to read" issue?

The git version is 2.16.4, but the corruption may have occurred under version 2.8.3.

The blob is not one that exists in the "official" repo, so it likely just belongs to a local branch/reflog/etc. There are many local branches.

There are many worktrees on this repo, and it may have had worktrees added, removed, and pruned during its lifetime.

Debugging information:

git repack -adfb --max-pack-size=256m --window=40 --window-memory=100m
Counting objects: 5999778, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (5983452/5983452), done.
warning: disabling bitmap writing, packs are split due to pack.packSizeLimit
fatal: unable to read 89a9259486af9e3f0b24f3338ec39b18a7ba39c3

I've tried a few different fsck command-lines all with the same results:

$ > git fsck --cache --no-dangling --name-objects --progress
Checking object directories: 100% (256/256), done.
Checking objects: 100% (14155357/14155357), done.
Checking connectivity: 6003771, done.

git show 89a9259486af9e3f0b24f3338ec39b18a7ba39c3
fatal: bad object 89a9259486af9e3f0b24f3338ec39b18a7ba39c3

$ > git branch --contains 89a9259486af9e3f0b24f3338ec39b18a7ba39c3 --all
error: no such commit 89a9259486af9e3f0b24f3338ec39b18a7ba39c3

This is a script I previously got off the internet for other purposes, but I thought it might help:

$ > /tmp/git_blob_to_commit.pl 89a9259486af9e3f0b24f3338ec39b18a7ba39c3
[no output]

Note that this is a huge repo, so gc/repack operations take a very long time. If you give me some advice, I am not ignoring it; I am probably trying it, but it may be hours before I can report back on how it went.

Update: re-running the command and pressing [return] a couple of times shows that the error is not in the compressing phase. It is perhaps in the writing phase. (?)

Counting objects: 6006957, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (5990610/5990610), done.
Writing objects:  19% (1193602/6006957)
warning: disabling bitmap writing, packs are split due to pack.packSizeLimit
Writing objects:  26% (1579434/6006957)
Writing objects:  63% (3802470/6006957)
fatal: unable to read 89a9259486af9e3f0b24f3338ec39b18a7ba39c3

Upvotes: 4

Views: 825

Answers (2)

Mort

Reputation: 3549

This is a tricky scenario where older versions of git would incorrectly prune objects that were actually in use by the index on a worktree.

Here is the rough approach that I took. It could surely be optimized, but I hope to never have to do it again.

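git worktree list --porcelain | sed -n 's/^worktree //p' | while IFS= read -r wt
do
    echo "TITLE $wt"
    git -C "$wt" ls-files --stage    # prints "<mode> <sha> <stage><TAB><path>"
done > /tmp/blobs.txt         # This is potentially a massive file

# Skip the TITLE lines and check each distinct ID once (still brute force).
# cat-file -e only tests that the object exists; it does not print the blob.
# Gitlink entries (mode 160000) name submodule commits, so they may be reported too.
grep -v '^TITLE ' /tmp/blobs.txt | awk '{print $2}' | sort -u | while read -r sha
do
    git cat-file -e "$sha" 2>/dev/null || echo "NOT FOUND $sha"
done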

For each "NOT FOUND" entry, run grep -E "TITLE|<sha>" /tmp/blobs.txt to find the worktree it's in (the nearest TITLE line above the matching entry). Then go to that worktree and unstage anything in the index. That should fix the problem(s).
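
The lookup step can also be scripted rather than read off the grep output by eye. The following is only a sketch: find_blob_in_listing is a hypothetical helper name, and it assumes the listing has the layout produced above, that is, "TITLE <worktree>" lines each followed by git ls-files --stage output.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer): print every worktree
# and path that stages the given object ID, based on a listing file made of
# "TITLE <worktree>" lines followed by `git ls-files --stage` output.
find_blob_in_listing() {
    sha=$1
    listing=${2:-/tmp/blobs.txt}
    awk -v sha="$sha" '
        # Remember which worktree the following index entries belong to.
        $1 == "TITLE" { wt = substr($0, 7); next }
        # Index entries look like "<mode> <sha> <stage><TAB><path>".
        $2 == sha {
            tab = index($0, "\t")
            print wt ": " substr($0, tab + 1)
        }
    ' "$listing"
}
```

Calling find_blob_in_listing 89a9259486af9e3f0b24f3338ec39b18a7ba39c3 would then print one "worktree: path" line per index entry that still references the missing blob, or nothing if no worktree stages it.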


Thanks @torek for providing the information to get to this conclusion. (You have enough SO reputation that I don't think you'll mind not getting the points for this answer.)

Upvotes: 2

max630

Reputation: 9258

git repack -adfb --max-pack-size=256m --window=40 --window-memory=100m
...
Compressing objects: 100% (5983452/5983452), done.
...
fatal: unable to read 89a9259486af9e3f0b24f3338ec39b18a7ba39c3

It looks like the object is not referenced from anywhere; otherwise you would not get past the "Compressing" phase, and the failure happens while cleaning up objects. You could verify this by running fsck with --dangling and --unreachable: it would either print the object in its list or fail on it.

Upvotes: 1
