TheM00s3

Reputation: 3711

Replica set members not resyncing after primary goes down

I have a replica set with 3 nodes: a server named dev-6, which is running mongo 3.0.6, and dev 5, which has 2 mongo instances on it running 3.2. I'd like dev 6 to be the primary, so I added the other 2 nodes to its initiated replica set. Once I do that, it becomes primary and the other 2 nodes begin to sync from it. Here is a screenshot of how my setup looks after dev 6 is brought down and then brought back up.

enter image description here

As intended, dev 6 is a secondary, and so is dev 5:27018. What I'm wondering, though, is why dev 5 is saying there's no one to sync with, while dev 5:27019 is saying that it's syncing with dev 5:27018.

I'm now going to follow the mongo instructions to make dev 6 the primary. Here is the result:

enter image description here

Dev 6 is the primary, but what I'm trying to understand is why the dev 5 instances are not connecting to dev 6. Before anyone jumps to conclusions: I am able to ping dev 5 from dev 6 and vice versa, and the /etc/hosts file on each server contains the IP address of the other.

EDIT: I'm basing the claim that the replica set members are unable to connect on the following message: "lastHeartbeatMessage" : "could not find member to sync from". This seems to be fixed if I run rs.config(//current cfg) or if I add or remove a replica set member.
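To see at a glance which members report that message, the output of rs.status() can be filtered by its lastHeartbeatMessage field. The following is only a sketch: the status object is a made-up excerpt (set name, host names and values are assumptions, not taken from the screenshots), and membersReporting is a hypothetical helper, not a shell built-in.

```javascript
// Hypothetical excerpt of rs.status() output; set name, hosts and values are invented.
const status = {
  set: "rs0",
  members: [
    // The entry for the node you are connected to carries no heartbeat fields.
    { name: "dev-6:27017", stateStr: "PRIMARY" },
    {
      name: "dev-5:27018",
      stateStr: "SECONDARY",
      lastHeartbeatMessage: "could not find member to sync from",
    },
    { name: "dev-5:27019", stateStr: "SECONDARY", lastHeartbeatMessage: "" },
  ],
};

// Return the names of all members whose last heartbeat message contains `text`.
function membersReporting(status, text) {
  return status.members
    .filter((m) => (m.lastHeartbeatMessage || "").includes(text))
    .map((m) => m.name);
}

console.log(membersReporting(status, "could not find member to sync from"));
// prints [ 'dev-5:27018' ]
```

In the mongo shell the same function could be applied to the live result of rs.status() instead of the sample object.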

Upvotes: 0

Views: 2303

Answers (2)

Andriy Simonov

Reputation: 1288

Your replica set seems to be healthy in both cases. All secondaries have applied the last operation from the primary's operation log (optime/optimeDate are the same); moreover, lastHeartbeat is only slightly behind the dev 6 time. Regarding the lastHeartbeatMessage, refer to this Jira issue, which says:

When a secondary chooses a source to sync from, it will choose a node whose oplog is newer than (not equal to) its own. After startup, when all nodes have some data, the oplogs will be the same, so the secondary cannot choose a sync source. After a write operation happens, the primary will have a newer oplog, the secondary can successfully choose a target to sync from, and the error message will disappear.
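The rule in that quote can be illustrated with a small model. This is a deliberately simplified sketch, not mongod's actual selection code: chooseSyncSource is a hypothetical function, optimes are reduced to plain numbers, host names are invented, and picking the newest candidate is an arbitrary simplification.

```javascript
// Simplified model of the quoted rule: a member qualifies as a sync source only
// if its latest oplog entry is strictly newer than ours. Real selection in
// mongod considers more factors that are not modelled here.
function chooseSyncSource(self, members) {
  const candidates = members.filter(
    (m) => m.name !== self.name && m.optime > self.optime
  );
  if (candidates.length === 0) {
    return null; // corresponds to "could not find member to sync from"
  }
  // Arbitrary simplification: take the candidate with the newest optime.
  return candidates.reduce((a, b) => (b.optime > a.optime ? b : a)).name;
}

// Idle set: every member has applied the same last operation.
const idle = [
  { name: "dev-6:27017", optime: 100 },
  { name: "dev-5:27018", optime: 100 },
  { name: "dev-5:27019", optime: 100 },
];
console.log(chooseSyncSource(idle[1], idle)); // prints null

// After one write on the primary, its oplog is ahead of the secondaries.
const afterWrite = idle.map((m) =>
  m.name === "dev-6:27017" ? { ...m, optime: 101 } : m
);
console.log(chooseSyncSource(afterWrite[1], afterWrite)); // prints dev-6:27017
```

Under this model the message is expected on an idle set and goes away as soon as the primary accepts a write.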

Upvotes: 1

Richard Carter

Reputation: 11

I usually associate the error "could not find member to sync from" with replica set members not being able to talk to one another, either because of firewall or credential issues.

I know that you can ping the servers, but have you tried connecting to the primary mongo instance from one of the secondaries using the mongo client?

mongo vpc-dev-app-06:27017

with appropriate user credentials if necessary.
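A successful ping only shows that ICMP gets through; it says nothing about whether the mongod port is reachable. If the mongo client is not at hand, a rough TCP check can be done from Node.js. This is a minimal sketch: canConnect is a hypothetical helper, the 2000 ms default timeout is an arbitrary choice, and the check neither authenticates nor speaks the MongoDB protocol.

```javascript
const net = require("net");

// Resolves to true if a TCP connection to host:port opens within timeoutMs,
// and to false on a connection error or timeout.
function canConnect(host, port, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (ok) => {
      socket.destroy(); // always release the socket before reporting
      resolve(ok);
    };
    socket.setTimeout(timeoutMs, () => finish(false));
    socket.once("connect", () => finish(true));
    socket.once("error", () => finish(false));
  });
}

// Example call, using the host and port from the command above:
// canConnect("vpc-dev-app-06", 27017).then((ok) => console.log(ok));
```

If this resolves to false while ping succeeds, a firewall rule or the bind address of the mongod would be the first things to look at.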

Has anything possibly changed in the mongod.conf as part of the upgrade?

Upvotes: 0
