Reputation: 1711
I set up a replica set with three members, one of which is an arbiter.
After I restarted one member, it stayed in RECOVERING for a long time and never returned to SECONDARY, even though the database is not large.
The replica set status looks like this:
rs:PRIMARY> rs.status()
{
"set" : "rs",
"date" : ISODate("2013-01-17T02:08:57Z"),
"myState" : 1,
"members" : [
{
"_id" : 1,
"name" : "192.168.1.52:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 67968,
"optime" : Timestamp(1358388479000, 1),
"optimeDate" : ISODate("2013-01-17T02:07:59Z"),
"self" : true
},
{
"_id" : 2,
"name" : "192.168.1.50:29017",
"health" : 1,
"state" : 7,
"stateStr" : "ARBITER",
"uptime" : 107,
"lastHeartbeat" : ISODate("2013-01-17T02:08:56Z"),
"pingMs" : 0
},
{
"_id" : 3,
"name" : "192.168.1.50:27017",
"health" : 1,
"state" : 3,
"stateStr" : "RECOVERING",
"uptime" : 58,
"optime" : Timestamp(1358246732000, 100),
"optimeDate" : ISODate("2013-01-15T10:45:32Z"),
"lastHeartbeat" : ISODate("2013-01-17T02:08:55Z"),
"pingMs" : 0,
"errmsg" : "still syncing, not yet to minValid optime 50f6472f:5d"
}
],
"ok" : 1
}
How should I fix this problem?
Upvotes: 7
Views: 18437
Reputation: 41
I fixed the issue with the following procedure.
Log in to a different node and remove the problem node from the MongoDB replica set, e.g.:
rs.remove("10.x.x.x:27017")
Stop the MongoDB server on the problem node:
systemctl stop mongodb.service
Create a new folder for the dbpath:
mkdir /opt/mongodb/data/db1
Note: the existing path was /opt/mongodb/data/db
Change dbPath in /etc/mongod.conf (the MongoDB YAML config file):
dbPath: /opt/mongodb/data/db1
Start the MongoDB service:
systemctl start mongodb.service
Take a backup of the existing folder and remove it:
mkdir /opt/mongodb/data/backup
mv /opt/mongodb/data/db/* /opt/mongodb/data/backup
tar -czvf /opt/mongodb/data/backup.tar.gz /opt/mongodb/data/backup
rm -rf /opt/mongodb/data/db/
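One thing the steps above do not spell out: a node removed with rs.remove() only resyncs once it is a member of the set again, so it would normally be re-added from the primary with rs.add() after the restart. The sketch below is one way to check its progress afterwards. The helper name memberState and the sample addresses are hypothetical, made up for illustration; only rs.add() and rs.status() are real shell helpers. console.log is used so the snippet also runs in Node or mongosh; the legacy mongo shell would use print instead.

```javascript
// Hypothetical helper: report the state of one member from an rs.status() document.
function memberState(status, name) {
  const member = status.members.find((m) => m.name === name);
  if (!member) return "NOT_IN_SET"; // e.g. removed and not yet re-added
  return member.stateStr;
}

// Illustrative status document (made-up sample, same shape as rs.status() output).
const sample = {
  members: [
    { name: "10.0.0.1:27017", stateStr: "PRIMARY" },
    { name: "10.0.0.2:27017", stateStr: "STARTUP2" },
  ],
};

console.log(memberState(sample, "10.0.0.2:27017"));
console.log(memberState(sample, "10.0.0.3:27017"));

// In the mongo shell, against a live set (addresses as in the steps above):
//   rs.add("10.x.x.x:27017")                     // run on the primary
//   memberState(rs.status(), "10.x.x.x:27017")   // repeat until "SECONDARY"
```

With the sample data this reports STARTUP2 for the member that is still doing its initial sync and NOT_IN_SET for an address that is not in the configuration.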
Upvotes: 4
Reputation: 55
Check the MongoDB documentation on this issue: https://docs.mongodb.com/manual/tutorial/resync-replica-set-member/#replica-set-auto-resync-stale-member
Upvotes: -2
Reputation: 652
This happens when replication has been broken for a while and the secondary no longer has enough data to resume replicating.
You have to resync the secondary, either by replicating the data from scratch or by copying it from another server, and then let replication resume.
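To see how far behind the member in the question actually is, the optimeDate values from its rs.status() output can be compared directly. This is a minimal sketch: the function names replicationLagSeconds and decodeOptime are my own, and it assumes the "minValid optime" in the errmsg is printed as hex seconds and a hex counter separated by a colon, which is consistent with the dates shown there.

```javascript
// Values copied from the rs.status() output in the question.
const status = {
  members: [
    { name: "192.168.1.52:27017", stateStr: "PRIMARY",
      optimeDate: new Date("2013-01-17T02:07:59Z") },
    { name: "192.168.1.50:29017", stateStr: "ARBITER" },
    { name: "192.168.1.50:27017", stateStr: "RECOVERING",
      optimeDate: new Date("2013-01-15T10:45:32Z") },
  ],
};

// Seconds each data-bearing member is behind the primary's last applied op.
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) throw new Error("no primary in status");
  const lag = {};
  for (const m of status.members) {
    if (m === primary || !m.optimeDate) continue; // arbiters report no optime
    lag[m.name] = (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000;
  }
  return lag;
}

// Decode an optime written as "<hex seconds>:<hex counter>" (assumed format).
function decodeOptime(text) {
  const [secs, inc] = text.split(":").map((part) => parseInt(part, 16));
  return { date: new Date(secs * 1000), increment: inc };
}

const lag = replicationLagSeconds(status);
const minValid = decodeOptime("50f6472f:5d");
console.log(lag);
console.log(minValid.date.toISOString(), minValid.increment);
```

For the question's data this gives a lag of 141747 seconds (a little over 39 hours) for 192.168.1.50:27017, and the minValid optime decodes to 2013-01-16T06:22:39Z, which is later than the member's own optime of 2013-01-15T10:45:32Z. The member has not yet applied operations up to that point, which is why it stays in RECOVERING.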
Upvotes: 1
Reputation: 1867
I had the exact same issue: a secondary member of the replica set stuck in RECOVERING. Here is how to solve the issue:
The member will start in STARTUP2 state and replicate all data from the primary.
Upvotes: 9