Christopher Best

Reputation: 476

git tree contains duplicate file entries

I struggled with some line-ending problems about 20 commits back and some weird things happened. Now git fsck shows:

Checking object directories 100% (256/256), done.
error in tree ee2060e71cb36d33be5ddc1fe9ca8d7dd0ab35cd: contains duplicate file entries
Checking objects: 100% (8633/8633), done.

and git show ee2060 shows:

File1.cs
File2.cs
File2.cs
File2.cs
File3.cs

This is preventing me from pushing to my remote. git push shows:

error: unpack failed: index-pack abnormal exit
To https://github.com/username/Project.git
 ! [remote rejected] master -> master (n/a (unpacker error))
error: failed to push some refs to 'https://github.com/username/Project.git'

I have tried repacking and garbage collecting. How can I resolve this problem?

Upvotes: 13

Views: 9250

Answers (6)

Julius G

Reputation: 11

I fixed this easily with BFG Repo-Cleaner; Henry Wilson's answer and Bl00dh0und's comment pointed me in the right direction for my case. Here are the commands I used:

# >> Clone repo with issue as mirror:
$ git clone --mirror git://example.com/some-repo.git

# >> Confirm error:
$ cd some-repo.git
$ git fsck
Checking object directories: 100% (256/256), done.
error in tree f7c504b6d4c1bc9b65787b29ff1606ecc3674e91: duplicateEntries: contains duplicate file entries
...

# >> Execute BFG's jar file from the parent directory (requires a Java runtime)
$ cd ..
$ java -jar bfg.jar --fix-filename-duplicates-preferring tree some-repo.git

# >> Per BFG's output direction
$ cd some-repo.git
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive

# >> Confirm fix / no error:
$ git fsck
Checking object directories: 100% (256/256), done.
Checking objects: 100% (287096/287096), done.
Verifying commits in commit graph: 100% (11513/11513), done

Fixed!

Upvotes: 0

TonyH

Reputation: 1181

I used git-replace and git-mktree to fix this in the past. You essentially keep the broken tree object, but override all links and make them point to a new object.

  1. First, dump the bad tree: git ls-tree bad_tree_hash > tmpfile.txt. This writes out the entries of your bad tree. For example:

     040000·tree·3cdcc756ee0ed636c44828927126911d0ab28a18 →  xNotAlphabetic
     040000·tree·4ad0d8ef014b8cc09c95694399254eff43217bfb →  EXT
     040000·tree·d65085e4a05ea9ac8b79e37b87202dd64d402c2e →  duplicateFolder
     040000·tree·d65085e4a05ea9ac8b79e37b87202dd64d402c2e →  duplicateFolder
     040000·tree·fd0661d698ace91135a8473b26707892b7c89c32 →  ToolTester
     040000·tree·d65085e4a05ea9ac8b79e37b87202dd64d402c2e →  duplicateFolder
    

NB: · and → stand for the whitespace characters [space] and [tab].

  2. Next, edit the text, removing the duplicate lines, and save with Unix-style line endings (i.e. LF only, not CRLF). With this example, we get:

     040000·tree·4ad0d8ef014b8cc09c95694399254eff43217bfb →  EXT
     040000·tree·d65085e4a05ea9ac8b79e37b87202dd64d402c2e →  duplicateFolder
     040000·tree·fd0661d698ace91135a8473b26707892b7c89c32 →  ToolTester
     040000·tree·3cdcc756ee0ed636c44828927126911d0ab28a18 →  xNotAlphabetic
    
  3. Run cat tmpfile.txt | git mktree, which creates a new, fixed tree object, saves it, and prints its hash: a55115e4a05ea9ac8b79e37b872024d64d4r2c2e, referred to below as new_tree_hash for demo purposes.

  4. Finally, git replace creates a new reference that makes everything that pointed at the bad tree use the new, fixed object instead: git replace bad_tree_hash new_tree_hash

This will solve your immediate problem. If you're interested, look at the overriding link in the .git/refs/replace folder.


The bad tree object will continue to generate warnings whenever you do a check on your repository with git fsck, but it can be ignored, and all your commits and other links will be consistent and working regardless.


8-year retrospective: There's probably a way to just delete the old, corrupt tree, since git replace should make it moot.

Further warning: This hack could also be rejected by a git hosting service, e.g. BitBucket or GitHub, since they could view it as corruption.
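
For what it's worth, the manual edit in the steps above could be scripted roughly as below. This is only a sketch, not something I ran against the repository in the question: dedupe_tree_entries and replace_bad_tree are names made up for illustration, it assumes that keeping the first line for each entry name is what you want, and it assumes no entry name contains a tab or newline.

```shell
# Hypothetical helper: keep only the first ls-tree line for each entry name.
# ls-tree lines look like "<mode> <type> <hash><TAB><name>", so when splitting
# on the tab, the name is field 2.
dedupe_tree_entries() {
    awk -F'\t' '!seen[$2]++'
}

# Hypothetical helper: build a de-duplicated copy of the tree given as $1 and
# register it as the replacement for the bad one. Prints the new tree's hash.
replace_bad_tree() {
    bad_tree=$1
    # git mktree reads ls-tree formatted lines and writes a new tree object.
    new_tree=$(git ls-tree "$bad_tree" | dedupe_tree_entries | git mktree) || return 1
    git replace "$bad_tree" "$new_tree" || return 1
    echo "$new_tree"
}
```

You would call it as replace_bad_tree bad_tree_hash from inside the affected repository. It is worth comparing git ls-tree output for the old and new hashes before relying on the result.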

Upvotes: 9

Henry Wilson

Reputation: 3351

I had a problem of this ilk, and all the solutions here and in other SO threads failed to fix it for me. In the end I used BFG Repo-Cleaner to destroy all the commits that referenced the bad folder name, which was probably overkill but successfully repaired the repo.

Upvotes: 2

Christopher Best

Reputation: 476

I finally fixed the repo by doing the following:

  1. do a fresh clone from github, which only included commits before the problem occurred
  2. add my messed up repo from the filesystem as a remote on the new clone
  3. painstakingly check out commits from the bad repo into the working copy of the new clone

    git checkout fe3254FIRSTCOMMITAFTERORIGIN/MASTER/HEAD .   # note the dot at the end
    # without the dot, git moves HEAD to that commit instead of copying the commit's
    # files into the working copy, which seems to bring the corrupt object into the good clone
    
  4. commit each in turn, manually copying the commit message from the other repo
  5. remove the corrupt repo from remotes
  6. garbage collect + prune

    git gc --aggressive --prune=now
    
  7. weep happily as git fsck shows no duplicate file entries

Upvotes: 5

Adam Dymitruk

Reputation: 129654

Check out a new branch just before the problematic commit. Then check out the files from the problematic commit, and add and commit them reusing the same message (use the -C option). Repeat for the rest of the commits. When you're done, reset the other branch to point to this corrected one. You can then push.

Upvotes: 1

che

Reputation: 12273

Rebasing your commits again might fix it. If that does not help, you can use git's low-level commands (git-cat-file) to see which commits contain this weird tree object, and put a correct version of the tree, without the duplicates, in its place. However, I don't know of any automatic tools that can fix this, and you'll probably have to rewrite all the tree and commit objects that already link to the weird one.

By the way, git ls-tree ee2060 should show you more details about the data that are in the damaged tree, such as files that are referenced there.
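
As a rough sketch of how to find the commits that reference the weird tree, something like the following might work. The function names are made up for illustration, and I am assuming that git ls-tree can still list the damaged tree (as it appears to in this case). The first search only catches commits whose top-level tree is the bad one; the second walks every commit and looks for the tree nested at any depth, which can be slow on a large history.

```shell
# Hypothetical helper: read "<commit> <root-tree>" lines on stdin and print the
# commits whose root tree hash starts with the prefix given as $1.
match_root_tree() {
    awk -v t="$1" 'index($2, t) == 1 { print $1 }'
}

# Commits whose top-level tree is the bad tree (%H = commit hash, %T = tree hash).
commits_with_root_tree() {
    git log --all --format='%H %T' | match_root_tree "$1"
}

# Commits that contain the bad tree as a subdirectory at any depth.
# ls-tree -r -t recurses and also prints the tree entries themselves.
commits_containing_tree() {
    git rev-list --all | while read -r c; do
        if git ls-tree -r -t "$c" | grep -q " tree $1"; then
            echo "$c"
        fi
    done
}
```

For the tree in the question you would run commits_with_root_tree ee2060 and commits_containing_tree ee2060 inside the repository.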

Upvotes: 0
