Reputation: 8950
The question might seem ridiculous but it seems to me that a "yes" would be a little crazy.
MongoDB suggests to have replication sets of 3 machines. So if the database can stand on 1 computer, I need 3 machines, and if tomorrow I need to shard and need 2 machines I will actually need 6, right ?
Or is there something smarter that can be done and that comes for free with mongoDB ? (with coding theory like Hamming, ... the number of extra bits that we need is not linear in the size of the total number of bits)
Please don't hesitate to ask me to reformulate if what I say is not clear
Thanks in advance for your answers,
Thomas
Upvotes: 0
Views: 94
Reputation: 3150
So there is some really good documentation which is the recommended cluster setup in terms of phisycal instance separation. There should be considered two things (at least) separately. One is replication and for this one see this documentation : http://docs.mongodb.org/manual/core/replica-set-members/
Which means you have to have at least two data nodes (due to HA) in a replicaset and can have one arbiter which is not holding data just participate in election as it is described in the docs linked above. You need an odd number of setmembers due to the primary has to be elected by a majority inside the replicaset.
The other aspect is sharding. Sharding needs some additional metadata maintaining layer which is achived through additional processes these are configuration servers and mongos routers. For sharded production cluster see : http://docs.mongodb.org/manual/core/sharded-cluster-architectures-production/. In this setup the three configservers have to be on separated instances. Also the two mongos processes cannot reside on the same instance.
So for the minimal alignment. Have to be considered :
So theoretically one can align a sharded cluster without braking any of the recomendations on 4 instances with two shards like this:
Instance 1: datanode replicaset 1, configserver 1, arbiter replicaset 2
Instance 2: datanode replicaset 1, configserver 2, mongos 1
Instance 3: datanode replicaset 2, configserver 3, arbiter replicaset 1
Instance 4: datanode replicaset 2, mongos 2
Where replicaset 1 represents the first shard and replicaset 2 represents the second.
datanode is not a terminology which is used for mongoDB in general just i am likely to address with this name those mongod process which are handling real data, so the (Primaries and secondaries in a replicaset). Just as a sidenote i would not do this. Just start micro instances for the configservers and keep mongos processes on the application servers.
Upvotes: 2