Sergey Gavruk
Sergey Gavruk

Reputation: 3568

MongoDB Config Servers - read only

I wonder about this statement about config servers in mongodb (from documentation):

If any of the config servers is down, the cluster's meta-data goes read only. However, even in such a failure state, the MongoDB cluster can still be read from and written to.

We can use 1 or 3 config servers. Why if we use 3 config servers and one server is down, cluster goes to read-only mode? As you can see from the link above, Each config server has a complete copy of all chunk information.
If each sesrver have a complete copy of all chunk information, why does it goes read only after one config server is down?

Upvotes: 0

Views: 1379

Answers (1)

ChadsworthIII
ChadsworthIII

Reputation: 106

So, the reason for this is the way that config servers do their 2 phase commits. While if you have one config server and it fails, then your whole system fails. If you have 3 and one fails all of the metadata is still available, but you lose the resiliency factor of 2 phase commits. You cannot do 2 phase commits without 3 members.

So you may still run off of the other two for reads, but the balancer is essentially turned off so that no chunk migration or splits happen (hence metadata becomes read only). This is because you cannot commit splits or migrations using the commit process a 3 node config setup uses, so they don't happen.

Running with 1 config server is not recommended. Basically if it goes down, you don't know where any of your data is.

A 2 phase commit only works with 3 machines because it can make sure your data stays in a consistant state. It means that if a machine died in the middle of an update, that update will either fail or persist depending on whether it was committed to at least another node which will update the third, (hence 2 phase commit). So it is safe to read a sharded cluster using the 2 config servers left.

You can't do that with 2 nodes. It might have gone through, it might have not, you cant tell because you cannot compare the last remaining node to anything since the other one is down. So the safe thing to do is not take any updates until you get the third node back up, otherwise you may be reading out of date data.

If you want seamless failure resistance, then it doesn't make sense to use 2 due to why you use the 2 phase commit. It really has no more durability then 1 node if you would rather have nothing then potentially incorrect data. And in a sharded cluster, nothing and incorrect data go hand in hand since either way you don't know where to find your chunks.

Its basically done to protect you from potential config data corruption and inconsistencies.

Upvotes: 2

Related Questions