DanJ

Reputation: 1756

MongoDB locked after aborted db.repairDatabase()? How to unlock?

I ran db.repairDatabase() from a mongo shell against a healthy but large MongoDB database. After about 10 hours it still had not completed, so, for better or worse, I hit Ctrl-C to cancel it.

It appears that the cluster has been left in some locked state. Commands like "show dbs" all fail with "Operation timed out":

mongos> show dbs
2016-06-10T09:38:10.179-0400 E QUERY    [thread1] Error: listDatabases failed:{ "code" : 50, "ok" : 0, "errmsg" : "Operation timed out" } :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
Mongo.prototype.getDBs@src/mongo/shell/mongo.js:62:1
shellHelper.show@src/mongo/shell/utils.js:760:19
shellHelper@src/mongo/shell/utils.js:650:15
@(shellhelp2):1:1

It has stayed like this for about 10 more hours since I killed the db.repairDatabase().

What is the correct way to recover from this?

My cluster info: I am running MongoDB 3.2.5 everywhere. I have 3 config servers and 11 shards; each shard is a replica set consisting of 2 nodes plus an arbiter. I also have about 40 nodes running mongos instances. The 3 config servers are still 3.0-style (not yet upgraded to a replica set).

Upvotes: 3

Views: 1046

Answers (1)

DanJ

Reputation: 1756

Well, for what it's worth, I was able to bring the cluster back as follows:

  1. Restarted all mongos services.
  2. Restarted all mongod arbiters.
  3. Restarted mongod for all 3 config servers.
  4. Restarted mongod for 1 node from each of my 11 shards' replica sets.
  5. Restarted mongod for the other node from each of my 11 shards' replica sets.

Steps 1 through 4 didn't fix anything.

But after I ran step 5 I was able to once again use all the databases. Things seem to be back to normal now.
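
For anyone in the same situation, here is a minimal sketch of one check that might be worth trying before restarting everything: looking for a repair operation that is still running on a shard member. This is not something I did above, and I can't say whether it would have helped. The helper name findRepairOps is made up for illustration. It assumes the input has the shape returned by db.currentOp(), and that the command document appears under either a query or a command field (which one is assumed to depend on the server version).

```javascript
// Sketch: pick out operations that look like a lingering repairDatabase.
// `currentOpResult` is assumed to look like the output of db.currentOp():
//   { inprog: [ { opid, secs_running, query or command, ... } ] }
function findRepairOps(currentOpResult) {
  var inprog = (currentOpResult && currentOpResult.inprog) || [];
  return inprog
    .filter(function (op) {
      // Assumption: the command document is reported under `query` on some
      // server versions and under `command` on others, so check both.
      var cmd = op.query || op.command || {};
      return Object.prototype.hasOwnProperty.call(cmd, "repairDatabase");
    })
    .map(function (op) {
      // Keep only the fields needed to identify the operation.
      return { opid: op.opid, secsRunning: op.secs_running };
    });
}

// Possible use in a mongo shell connected directly to a shard member
// (not through mongos):
//   findRepairOps(db.currentOp()).forEach(function (op) { printjson(op); });
//   db.killOp(opid);  // only after confirming that opid really is the repair
```

Whether db.killOp() can actually interrupt a repair that is partway through is an open question here, so treat this only as a way to see what is still running.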

Upvotes: 1
