Reputation: 3353
I have a replica set (hosted on Amazon) which has:
All of them run version 3.2.6, and this replica set forms one shard of my sharded cluster (in case that matters, although I think it does not).
When I run rs.status() on the primary, it says that it cannot reach the secondary (the arbiter reports the same thing):
{
    "_id" : 1,
    "name" : "secondary-ip:27017",
    "health" : 0,
    "state" : 8,
    "stateStr" : "(not reachable/healthy)",
    "uptime" : 0,
    "optime" : {
        "ts" : Timestamp(0, 0),
        "t" : NumberLong(-1)
    },
    "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
    "lastHeartbeat" : ISODate("2016-07-20T15:40:50.479Z"),
    "lastHeartbeatRecv" : ISODate("2016-07-20T15:40:51.793Z"),
    "pingMs" : NumberLong(0),
    "lastHeartbeatMessage" : "Couldn't get a connection within the time limit",
    "configVersion" : -1
}
(btw look at the optimeDate O.o)
The error in my log is:
[ReplicationExecutor] Error in heartbeat request to secondary-ip:27017; ExceededTimeLimit: Couldn't get a connection within the time limit
The strange thing is that when I run rs.status() on the secondary, everything looks OK. I am also able to connect to the secondary from my primary instance (with mongo --host secondary), so I guess it is not a network issue. Yesterday it was all working fine.
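To compare what each member reports, a small helper like the one below could be pasted into the mongo shell on the primary, the secondary, and the arbiter. It is only a sketch: the function name unreachableMembers is made up for illustration, and it assumes the document has the shape returned by rs.status() (a members array whose entries carry name, health, stateStr, and lastHeartbeatMessage, as in the output above).

```javascript
// Sketch: list the members that the current node considers unreachable.
// `status` is assumed to look like the result of rs.status().
// The helper name is hypothetical, not part of the MongoDB shell.
function unreachableMembers(status) {
  return (status.members || [])
    .filter(function (m) { return m.health === 0; })
    .map(function (m) {
      return {
        name: m.name,
        stateStr: m.stateStr,
        message: m.lastHeartbeatMessage || ""
      };
    });
}

// In the mongo shell on each member:
//   printjson(unreachableMembers(rs.status()));
```

Running it on every node makes the asymmetry easy to see: a member that appears in the list on the primary and arbiter but reports an empty list itself matches the situation described here.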
TL;DR: my primary cannot see the secondary, my arbiter cannot see the secondary, but my secondary sees the primary. It was all working fine just a day ago, and I am able to connect manually to the secondary from the primary instance.
Does anyone have an idea what could have gone wrong?
Tnx, Ivan
Upvotes: 6
Views: 7947
Reputation: 648
It seems the secondary's optimeDate is responsible for the error. The best way to find out why this optimeDate is wrong is to check the current date and time on the secondary's machine, as that could be wrong as well. Not sure whether you are still looking for an answer, but the optimeDate is the problem, not the connection between your replica set machines.
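One way to check the clocks, as a rough sketch: read the localTime field that the isMaster command returns on each machine and compare the two values. The helper name clockSkewMs is made up for illustration, and "secondary-ip" is just the placeholder from the question. The result also includes the network round trip, so only a large difference would be meaningful.

```javascript
// Sketch: absolute difference in milliseconds between two clock readings.
// Accepts Date objects (ISODate values in the mongo shell) or ISO strings.
// The helper name is hypothetical, not part of the MongoDB shell.
function clockSkewMs(timeA, timeB) {
  return Math.abs(new Date(timeA).getTime() - new Date(timeB).getTime());
}

// Possible usage from a mongo shell on the primary (assumes the secondary
// accepts the connection, as it does with mongo --host):
//   var here = db.isMaster().localTime;
//   var there = new Mongo("secondary-ip:27017").getDB("admin").isMaster().localTime;
//   print(clockSkewMs(here, there) + " ms");
```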
Upvotes: 0