carlosalbertomst

Reputation: 513

Distributed system design

In a distributed system, a certain node distributes 'X' units of work equally across 'N' nodes (via socket message passing).

As we increase the number of worker nodes, each node completes its job faster, but we have to set up more connections.

In a real situation, this would be like replacing 10 nodes in a Hadoop-like system, each processing 100GB, with 1,000,000 nodes, each processing 1MB.

Upvotes: 2

Views: 1088

Answers (3)

Nauman

Reputation: 299

Does it have to use sockets and message passing between Supervisor and Worker?

You could use some kind of queuing to avoid putting load on the Supervisor, or a distributed file system similar to HDFS to distribute the tasks and collect the results.

It also depends on the number of nodes you plan to deploy the Workers on. 1,000,000 nodes is a very large number, so in that case you'll have to distribute the tasks across multiple queues.

The thing to be careful about is what happens if all the nodes finish their tasks at the same time. It would be worth adding some variability to when they can request a new task. ZooKeeper (http://hadoop.apache.org/zookeeper/) is potentially something you can also use to synchronise the jobs.
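One way to add that variability is to have each Worker wait a randomised delay before asking for its next task. This is only a minimal sketch: the function name `jittered_delay` and its `base_seconds` and `spread` parameters are made up for illustration, and a real Worker would pick values to suit its own workload.

```python
import random
from typing import Optional


def jittered_delay(base_seconds: float, spread: float = 0.5,
                   rng: Optional[random.Random] = None) -> float:
    """Return base_seconds scaled by a random factor in [1 - spread, 1 + spread].

    Hypothetical helper: each Worker would sleep for this long before
    requesting a new task, so requests do not all arrive at once.
    """
    if not 0.0 <= spread <= 1.0:
        raise ValueError("spread must be between 0 and 1")
    if rng is None:
        rng = random.Random()
    return base_seconds * rng.uniform(1.0 - spread, 1.0 + spread)
```

A Worker could then call something like `time.sleep(jittered_delay(2.0))` before each request, which spreads the requests over roughly 1 to 3 seconds instead of a single instant.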

Upvotes: 1

willemIP

Reputation: 1

Can you measure your network cost? The time spent working on the worker machine should be only part of the total cost, alongside the cost of sending and receiving the messages.
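Once you have measurements, a rough model can show where the trade-off sits. The sketch below assumes (and these are assumptions, not something stated in the question) that the master sets up connections one after another at a fixed cost each, and that the workers then process their equal shares in parallel. The names `total_time`, `best_worker_count`, `seconds_per_unit` and `setup_seconds` are hypothetical.

```python
import math


def total_time(n_workers: int, work_units: float,
               seconds_per_unit: float, setup_seconds: float) -> float:
    """Estimated wall-clock time under the simple model described above."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    setup = n_workers * setup_seconds                       # sequential connection setup
    compute = (work_units / n_workers) * seconds_per_unit   # parallel work, equal shares
    return setup + compute


def best_worker_count(work_units: float, seconds_per_unit: float,
                      setup_seconds: float) -> int:
    """Worker count that minimises total_time in this model.

    The continuous minimum is at sqrt(work_units * seconds_per_unit / setup_seconds);
    we check the two neighbouring integers and keep the better one.
    """
    ideal = math.sqrt(work_units * seconds_per_unit / setup_seconds)
    candidates = {max(1, math.floor(ideal)), max(1, math.ceil(ideal))}
    return min(candidates,
               key=lambda n: total_time(n, work_units, seconds_per_unit, setup_seconds))
```

In this model both extremes lose: with very few workers the compute term dominates, and with very many the setup term does. If your measured setup cost does not grow linearly with the number of workers, the formula changes, but the same kind of comparison still applies.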

Also, can you describe, in big-O notation, the cost of merging each worker result into the master result?

Does your master round-robin over the expected responses?

By the way, if your worker nodes are finishing quicker but underutilizing their CPU resources, you may be missing a design trade-off.

Of course, you could be the rule or the exception to any law (argument/out-of-date research). ;-)

Upvotes: 0

Brent Arias

Reputation: 30215

Sounds like you will need to consult Amdahl's Law.

At least, that is how I worked out how many machines on a high-speed switch were optimal for my parallel computations.
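For reference, Amdahl's Law gives the speedup on n machines as 1 / ((1 - p) + p / n), where p is the fraction of the job that can run in parallel. A small sketch follows; the function name `amdahl_speedup` is just illustrative, and the formula on its own does not account for connection or messaging overhead.

```python
def amdahl_speedup(parallel_fraction: float, n_machines: int) -> float:
    """Speedup predicted by Amdahl's Law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_machines < 1:
        raise ValueError("n_machines must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_machines)
```

The speedup can never exceed 1 / (1 - p) however many machines are added, so if even a small part of the job is serial, going from 10 nodes to 1,000,000 buys far less than the node count suggests.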

Upvotes: 3
