Reputation: 758
I am new to Apache Storm, and I am trying to figure out how to configure Storm parallelism. There is a great article, "Understanding the Parallelism of a Storm Topology", but it only raises more questions.
When you have a multinode Storm cluster, each topology is distributed as a whole according to the TOPOLOGY_WORKERS configuration parameter. So if you have 5 workers, then you have 5 copies of the spout (1 per worker), and the same goes for bolts.
How do I deal with a situation like this inside a Storm cluster (preferably without creating external services):
Upvotes: 0
Views: 1212
Reputation: 7056
First, the basics:
Second, a correction: having 5 workers does NOT mean you will automatically have 5 copies of your spout. Having 5 workers means you have 5 separate JVMs to which Storm can assign executors (think of them as 5 buckets).
The number of instances of your spout is configured when you first create and submit your topology:
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("0-spout", new MySpout(), spoutParallelism).setNumTasks(spoutTasks);
Since you want only one spout for the entire cluster, you'd set both spoutParallelism and spoutTasks to 1.
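To make the "buckets" picture concrete, here is a minimal sketch in plain Java. It is a toy model, not Storm's actual scheduler: the class WorkerBuckets and its methods are hypothetical names invented for this illustration, and the round-robin placement is an assumption made only to show the counting. The point is that the number of spout instances follows the executor count you request, not the number of workers.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: this is NOT Storm's scheduler. It places executors into
// worker "buckets" round-robin, purely to illustrate how the counts relate.
final class WorkerBuckets {

    // Returns one list per worker; each entry names an executor placed in that worker.
    static List<List<String>> assign(int workers, String component, int executors) {
        List<List<String>> buckets = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            buckets.add(new ArrayList<>());
        }
        for (int e = 0; e < executors; e++) {
            // Assumed placement rule: executor e goes to worker (e mod workers).
            buckets.get(e % workers).add(component + "#" + e);
        }
        return buckets;
    }

    // Counts how many workers host at least one executor of the component.
    static long workersHosting(List<List<String>> buckets) {
        return buckets.stream().filter(b -> !b.isEmpty()).count();
    }
}
```

In this model, assign(5, "0-spout", 1) puts the single spout executor into one bucket and leaves the other four without a spout, while a bolt with 10 executors would be spread over all 5 workers.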
Upvotes: 2