Siddharth Mehrotra

Reputation: 43

Testing MongoDB Failover using Primary-Secondary-Arbiter architecture

I have a 3-node replica set with a primary-secondary-arbiter (PSA) architecture. It served stale reads when the primary node went down due to some issues. I'm trying to figure out what could have caused this, since the replica set should automatically elect the secondary node as the new primary.

I saw this warning in MongoDB Ops Manager: "This replica set has a Primary-Secondary-Arbiter architecture, but readConcern:majority is enabled for this node. This is not a recommended configuration."

I'm trying to reproduce the problem by setting up a replica set locally and running reads with and without read concern majority while emulating a primary-down scenario.

Steps I followed:

  1. Set up a replica set locally.
  2. Create a database.
  3. Add records to a collection.
  4. Write a C# program in Visual Studio that connects to the replica set.
  5. As soon as the connection is established, put the thread to sleep for 10 seconds and, during that window, kill the primary node using Windows Task Manager (expecting a failover to take place).
  6. Query the database with and without read concern majority.

Result: Both times I get a timeout.

What am I doing wrong? How can I reproduce the problem and make changes to fix it?

Upvotes: 0

Views: 151

Answers (1)

As mentioned in the documentation on mitigating performance issues in PSA, using a PSA environment can cause the majority commit point to lag and can lead to serious performance issues.

Also, as described in the documentation:

In earlier versions of MongoDB, enableMajorityReadConcern and --enableMajorityReadConcern were configurable allowing you to disable the default read concern "majority" which had a similar effect.

Could you tell me which MongoDB version you are using?

Finally, the steps to avoid the timeout and lag of the majority commit point are described below:

    cfg = rs.conf();
    cfg["members"][<array_index>]["votes"] = 0;
    cfg["members"][<array_index>]["priority"] = 0;
    rs.reconfig(cfg);
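To illustrate why this reconfiguration helps, here is a rough sketch of the majority arithmetic in plain JavaScript. It is a simplified model, not server or driver code. It assumes the write majority is the smaller of the majority of voting members and the number of data-bearing voting members, which is how I read the write concern documentation. It also assumes the reconfig is applied to the unavailable data-bearing member. The host names and the helper functions are made up for the example.

```javascript
// Simplified model of replica set majority counting (not MongoDB server code).
// Host names below are hypothetical.

// Members with votes > 0 count towards elections and majorities (votes defaults to 1).
function votingMembers(members) {
  return members.filter((m) => (m.votes ?? 1) > 0);
}

// Assumed rule: write majority = min(majority of voters, data-bearing voters).
function writeMajority(members) {
  const voters = votingMembers(members);
  const dataBearing = voters.filter((m) => !m.arbiterOnly);
  return Math.min(Math.floor(voters.length / 2) + 1, dataBearing.length);
}

// An election needs a strict majority of voters to be up, plus at least one
// reachable data-bearing member with priority > 0 to stand as a candidate.
function canElectPrimary(members, downHosts) {
  const voters = votingMembers(members);
  const upVotes = voters.filter((m) => !downHosts.includes(m.host)).length;
  const hasCandidate = members.some(
    (m) => !m.arbiterOnly && (m.priority ?? 1) > 0 && !downHosts.includes(m.host)
  );
  return hasCandidate && upVotes >= Math.floor(voters.length / 2) + 1;
}

// The majority commit point can advance only if enough data-bearing voters
// are up to acknowledge writes.
function canCommitMajority(members, downHosts) {
  const upDataVoters = votingMembers(members).filter(
    (m) => !m.arbiterOnly && !downHosts.includes(m.host)
  ).length;
  return upDataVoters >= writeMajority(members);
}

// PSA set: primary, secondary, arbiter.
const psa = [
  { host: "p:27017" },
  { host: "s:27018" },
  { host: "a:27019", arbiterOnly: true },
];
const down = ["p:27017"]; // the old primary was killed

console.log(canElectPrimary(psa, down));   // true: secondary + arbiter = 2 of 3 votes
console.log(canCommitMajority(psa, down)); // false: 1 data-bearing voter up, 2 needed

// Same set after giving the unavailable member votes: 0 and priority: 0.
const reconfigured = psa.map((m) =>
  m.host === "p:27017" ? { ...m, votes: 0, priority: 0 } : m
);
console.log(canCommitMajority(reconfigured, down)); // true: write majority drops to 1
```

In this model the failover itself can still happen with the primary down, but the majority commit point stops advancing until the unavailable member is made non-voting, which would match the staleness seen with readConcern:majority.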

Please reach out in case of further questions.

Regards

New response:

I apologise, I misunderstood your question.

The steps you are following look correct and should reproduce the error. With readConcern:majority, a read should not be satisfied until a new primary is elected and a majority of nodes have acknowledged the writes.

With readConcern: majority:

You should see a timeout or an error because the remaining secondary node cannot acknowledge the read operation with a majority until it is elected as the new primary.

Without readConcern: majority:

Adjust the C# program to use a different read concern (e.g., local) and observe the behaviour. The reads might still succeed because they don't require acknowledgment from the majority of nodes.

Also, could you share the error message that you are seeing?

Upvotes: 0
