Reputation: 3855
I have a setup 72 machines which take jobs from RabbitMQ queues and perform CRUD operations on a sharded MongoDB over 4 machines and an Arbeiter, i want to know if it makes a difference/sense to let only one machine handle the CRUD operations, everything is running on the same network.
on a side note the reason i am thinking about this is because it makes tracking progress easier.
Upvotes: 0
Views: 83
Reputation: 151072
Four instances would be light for a sharded solution, perhaps you are talking about Replication?
In MongoDB the concept of replication is used for high availability, not for performance. A minimum recommended configuration would be three nodes, one primary, one secondary and an arbiter. The purpose of the arbiter being to handle elections over which node is the Primary, essentially breaking the deadlock on two nodes by handing the majority of votes to one. More stable replica sets have at least three nodes of Primary, Secondary, Secondary..., and possibly an arbiter, where the arbiter is again primarily used to have an odd number of votes giving a majority to one node in an election.
In a Sharded configuration, a collection is split by a determined shard key across the shards in the cluster. The main purpose of sharding is when your data has a working set that is larger than the available RAM on a single node. It allows queries to be distributed across the cluster or targeted to the shard that contains the required data, in any case distributing the load or not tying up the resources of any given node.
Sharding also requires additional nodes known as config servers. These nodes hold the meta data regarding the location of data on the shards, and interrogated by the router to direct requests to each shard. In production you will have three config servers to protect against one going down.
In any case, sharded clusters are typically comprised of a replica set within each shard. From the replica set, only the primary node accepts connections for read/write operations. You can set up connections to allow reads from secondary nodes if you can live with the data possibly being inconsistent, but only one node will every accept writes, then replicating to the other secondary nodes. With shards, writes will be sent to the primary of the shard or shards depending on the shard key chosen.
So the number of servers involved in CRUD operations depends entirely on your configuration.
Upvotes: 1
Reputation: 404
Can you provide more details on your question? So you have 72 machines to crunch the RabbitMQ queue and 4 way sharded Mongo cluster to host the data? Have you done any analysis to understand what is the bottleneck on the system. Is it the rabbitmq processing or the CRUD on the DB? Also once you have sharded, if your keys are uniformly distributed the CRUD operations will get distributed over all your mongo shards.
Upvotes: 0