Pieter De Schepper

Reputation: 476

MongoDb corrupt collection?

I have a sharded Mongo 2.4.3 setup

When I try to run a query on one of my collections, I get the error: exception: BSONObj size: 0 (0x00000000) is invalid. Size must be between 0 and 16793600(16MB) First element: EOO

I've already run db.repairDatabase(), but that didn't seem to help.

When I just do db.collection.findOne(), it runs fine, but when I do e.g. db.collection.find({'_id': {'$gte': ObjectId("52631d000000000000000000")}}), I get the error above.

I've read that this might be because of a corrupt index, so I tried:

db.collection.reIndex();
{
    "raw" : {
        "rs0/ec2-xx-xxx-xxx-xxx.us-west-2.compute.amazonaws.com:27017,ec2-xx-xxx-xxx-xxx.us-west-2.compute.amazonaws.com:27017,ec2-xx-xx-xxx-xx.us-west-2.compute.amazonaws.com:27017" : {
            "nIndexesWas" : 2,
            "errmsg" : "exception: BSONObj size: 0 (0x00000000) is invalid. Size must be between 0 and 16793600(16MB) First element: EOO",
            "code" : 10334,
            "ok" : 0
        }
    },
    "ok" : 0,
    "errmsg" : "{ rs0/ec2-xx-xxx-xxx-xxx.us-west-2.compute.amazonaws.com:27017,ec2-xx-xxx-xxx-xxx.us-west-2.compute.amazonaws.com:27017,ec2-xx-xxx-xxx-xx.us-west-2.compute.amazonaws.com:27017: \"exception: BSONObj size: 0 (0x00000000) is invalid. Size must be between 0 and 16793600(16MB) First element: EOO\" }"
}

I had also dropped all other indexes before doing this, so only the one on _id is left:

db.collection.getIndexes();
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "ns" : "db.collection",
        "name" : "_id_"
    }
]

Still no luck, and I'm running out of ideas here. Does anyone have another suggestion? The collection has about 900M documents, so I really need to recover this.

Upvotes: 4

Views: 12158

Answers (1)

Adam Comerford

Reputation: 21682

If you have run repair successfully already, then you have rewritten all of your data from scratch and rebuilt the indexes, so you have effectively dropped and rebuilt the _id index already.

You have a replica set (rs0), so what happens if you step down the current primary? Is the corruption on the other nodes as well, or just on one?
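To compare the members, something along these lines in the mongo shell might help. It is only a sketch: findCorruptMembers is a made-up helper name, the host strings in the usage note are placeholders, and it assumes you can connect to each mongod directly rather than through mongos. Bear in mind that a full validate can be slow and resource-intensive on a collection of this size.

```javascript
// Hypothetical helper (not part of MongoDB): run a full validate on one
// collection on each replica set member and return the hosts that fail.
// `connect` is passed in so the caller decides how connections are made;
// in the mongo shell it would be: function (h) { return new Mongo(h); }
function findCorruptMembers(hosts, dbName, collName, connect) {
  var corrupt = [];
  hosts.forEach(function (host) {
    var ok;
    try {
      var conn = connect(host);
      conn.setSlaveOk(); // allow reads when the member is a secondary
      var coll = conn.getDB(dbName).getCollection(collName);
      var res = coll.validate(true); // true = full validation
      ok = res.valid === true;
    } catch (e) {
      // Treat an exception (such as the BSONObj size error) as a failure.
      ok = false;
    }
    if (!ok) {
      corrupt.push(host);
    }
  });
  return corrupt;
}
```

In the shell you would call it as findCorruptMembers(["host1:27017", "host2:27017", "host3:27017"], "db", "collection", function (h) { return new Mongo(h); }). An empty result means every member passed validation; any host it returns is a candidate for a wipe and resync.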

If it is just one (or at the very least you have one node that does not have the issue), then wipe the instances with corruption and have them resync. They will pull the data from one of the remaining "good" nodes, which gets rid of the corruption. This would be the preferred method (by far) to get rid of this corruption; it's one of the intended uses of replica sets in the first place.
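If you wipe more than one member, it is safer to wait for each resync to finish before touching the next node. A rough way to check that from the mongo shell is sketched below; allMembersSynced is a made-up name, and the check assumes a set with data-bearing members only (no arbiters), where state 1 is PRIMARY and state 2 is SECONDARY in the rs.status() output.

```javascript
// Hypothetical helper: given the document returned by rs.status(), report
// whether every member is healthy and in PRIMARY (1) or SECONDARY (2) state.
// A member that is still doing its initial sync reports a different state,
// so this returns false until the resync has completed.
function allMembersSynced(status) {
  if (!status || !status.members || status.members.length === 0) {
    return false;
  }
  return status.members.every(function (m) {
    return m.health === 1 && (m.state === 1 || m.state === 2);
  });
}
```

In the shell, connected to any member of the set, you would call allMembersSynced(rs.status()) and only move on to the next node once it returns true.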

If that is not an option, and all of the nodes show the same problem (this would be odd), then you should attempt to do the following:

  1. Run the entire instance (i.e. shut down the mongod process, and then restart) with --repair, and have it rebuild all data (local, all databases, all indexes) from scratch.
  2. If, and only if, that does not remove the corruption, then the approach of last resort is to try mongodump --repair. This method will try all reasonable ways to resolve the issue, and it may produce duplicate documents (depending on the level of corruption). Hence you may need more disk space than mongod originally used for storage for this to complete.

Finally, you should upgrade to a later version of MongoDB, particularly in a sharded environment. As of writing this answer, the current version is 2.4.9 and it has several important bug fixes - full details can be found in the release announcements over on mongodb-announce.

Upvotes: 6
