Gregory Stein

Reputation: 323

Three datacenters and a MongoDB replica set with a requirement of local read availability

We have three datacenters which are connected via a private network (not connected to the Internet). We use MongoDB to store data. One datacenter is a backup of another, and for this we configured a replica set. All writes are done in a specific datacenter, say DC1.

DC1 (primary) <-- a service running here writes data into the DB
DC2 (secondary) <-- only reads here
DC3 (secondary) <-- only reads here

Now consider the case when the connection to one of the datacenters fails, for example DC2, and it becomes isolated:

Network 1:
DC1 (primary) <-- a service running here writes data into the DB
DC3 (secondary) <-- only reads here

Network 2:
DC2 (secondary) <-- only reads here

DC2 does not have a majority (50% + 1 nodes) on its side, but we still need it to keep serving reads (mostly to services running on the same node as the secondary) until the network is restored. This fault tolerance should happen automatically. The fact that no majority of members is reachable makes it unable to become primary (which is correct), but we need to keep using it for read-only operations.

I know there are special members such as arbiters that can be added, but no number of them will solve the problem, because we need the same behavior in case of a network failure for DC3 as well.

We are struggling to find a replica set configuration that fits our case. It looks like the replica set mechanism as provided by MongoDB doesn't cover our requirement. Maybe there is another mechanism that could be applied here?

I will really appreciate any help. If I missed some information you need to answer this question, please write in the comments.

MongoDB: 7.0

Upvotes: 0

Views: 34

Answers (1)

Wernfried Domscheit

Reputation: 59557

Your situation really looks like the standard setup for a replica set.

Just give the member in DC1 a higher priority than the others have, that's it. In mongosh:

cfg = rs.conf()
// DC1 gets the highest priority, so it is preferred as primary whenever it is reachable
cfg.members = [
  {_id: 0, host: 'host1.dc1:27017', priority: 10},
  {_id: 1, host: 'host1.dc2:27017', priority: 1},
  {_id: 2, host: 'host1.dc3:27017', priority: 1}
]
rs.reconfig(cfg)

If DC2 is not reachable by the other members, then 2 out of 3 members can still reach each other, i.e. DC1 remains primary and your entire database stays fully available. When you connect to DC2 directly, you can still read from it.
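
If you need to read from the isolated DC2 member while the partition lasts, here is a minimal mongosh sketch; the host name is taken from the example configuration above, and the database and collection names are placeholders:

// Connect to the isolated member directly instead of through replica set discovery.
// directConnection=true skips server selection (the primary is unreachable from here),
// readPreference=secondaryPreferred allows queries although the member is not primary.
db = connect('mongodb://host1.dc2:27017/mydb?directConnection=true&readPreference=secondaryPreferred')
db.myCollection.find()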

Note that in order to elect a primary, the majority of all members must be available - or more precisely, the majority of all voting members.
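
For completeness, a quick way to check which members actually vote, sketched in mongosh against any reachable member:

// Print host, votes and priority for every member of the current configuration.
rs.conf().members.forEach(m => print(m.host, 'votes:', m.votes, 'priority:', m.priority))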

Some more information:

  • The majority on one side does not matter. What is relevant is the number of available and reachable members in the entire replica set, no matter whether they run in different datacenters or even all on a single machine (which does not make much sense for a production database).
  • You cannot control through the replica set configuration from which member the client reads the data. You define this only on the client side, via the read preference (see the sketch after this list).
  • By default, you cannot define from which specific secondary the client reads the data - unless you work with replica set tag sets, but for only three members that does not make much sense.
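
To illustrate the last two points, a hypothetical connection string (the replica set name rs0 and the database name are assumptions) that lets every application read from the member closest to it, typically the one in its own datacenter:

// readPreference=nearest is a client-side setting; the replica set configuration plays no role here.
db = connect('mongodb://host1.dc1:27017,host1.dc2:27017,host1.dc3:27017/mydb?replicaSet=rs0&readPreference=nearest')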

Upvotes: 0
