sahar shokouhi

Reputation: 701

Which class in Storm instantiates the threads for each bolt and spout?

I need to know how Storm manages the number of parallel workers for each bolt. Neither IRichBolt nor IRichSpout implements Runnable, so how does Storm manage multithreading?

Upvotes: 3

Views: 814

Answers (1)

user2720864

Reputation: 8171

This is a bit too broad to cover fully, but here is what I can share. In brief, spouts and bolts in Storm are the components that actually process the data. In Storm terminology, their running instances are known as tasks (so the parent interface, such as IRichSpout, does not need to implement something like Runnable). The threads responsible for carrying out these tasks are called executors. From the docs:

in Storm’s terminology "parallelism" is specifically used to describe the so-called parallelism hint, which means the initial number of executor (threads) of a component (spout or bolt)
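To make the task/executor split concrete, here is a minimal plain-Java sketch. It does not use Storm's classes and is not how Storm is implemented internally; `Task`, `Executor`, and `runWithParallelism` are hypothetical names invented for illustration. It only shows the idea that a component without `Runnable` can still be driven by a thread that owns it, and that the parallelism hint decides how many such threads exist.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ExecutorSketch {
    // Hypothetical stand-in for a bolt: note that it does NOT implement Runnable.
    public interface Task {
        void execute(String tuple);
    }

    // Hypothetical stand-in for an executor: a thread that owns its tasks
    // and calls them for every tuple it is given.
    public static class Executor extends Thread {
        private final List<Task> tasks;
        private final List<String> input;

        public Executor(String name, List<Task> tasks, List<String> input) {
            super(name);
            this.tasks = tasks;
            this.input = input;
        }

        @Override
        public void run() {
            for (String tuple : input) {
                for (Task task : tasks) {
                    task.execute(tuple);
                }
            }
        }
    }

    // Starts one executor thread per unit of parallelism and splits the
    // tuples between them round-robin (an assumption made for this sketch).
    public static List<String> runWithParallelism(int parallelismHint, List<String> tuples) {
        List<String> seen = new CopyOnWriteArrayList<>();
        List<Executor> executors = new ArrayList<>();
        for (int i = 0; i < parallelismHint; i++) {
            Task bolt = tuple -> seen.add(Thread.currentThread().getName() + ":" + tuple);
            List<String> slice = new ArrayList<>();
            for (int j = i; j < tuples.size(); j += parallelismHint) {
                slice.add(tuples.get(j));
            }
            executors.add(new Executor("executor-" + i, Collections.singletonList(bolt), slice));
        }
        for (Executor e : executors) {
            e.start();
        }
        try {
            for (Executor e : executors) {
                e.join();
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for executors", ex);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> tuples = new ArrayList<>();
        Collections.addAll(tuples, "a", "b", "c", "d");
        System.out.println(runWithParallelism(2, tuples));
    }
}
```

Each recorded entry carries the name of the thread that ran the task, so you can see that the work happens on the executor threads rather than in the component itself.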

These executors (threads) are in turn spawned by the worker process. From the docs:

A worker process executes a subset of a topology. A worker process belongs to a specific topology and may run one or more executors for one or more components (spouts or bolts) of this topology
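The relationship between workers, executors, and tasks is easy to work out by hand. The sketch below is plain Java, not Storm code: the component names and counts are made-up illustrative values, `Component` is a hypothetical helper type, and the even spread of executors across workers is a simplifying assumption rather than a description of Storm's scheduler.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ParallelismMath {
    // Hypothetical description of one component: its executors (threads) and tasks.
    public static final class Component {
        public final int executors;
        public final int tasks;

        public Component(int executors, int tasks) {
            this.executors = executors;
            this.tasks = tasks;
        }
    }

    // Illustrative topology: one spout and two bolts (values are assumptions).
    public static Map<String, Component> exampleTopology() {
        Map<String, Component> topology = new LinkedHashMap<>();
        topology.put("blue-spout", new Component(2, 2));
        topology.put("green-bolt", new Component(2, 4)); // 2 tasks per executor
        topology.put("yellow-bolt", new Component(6, 6));
        return topology;
    }

    // Total number of executor threads across all components.
    public static int totalExecutors(Map<String, Component> topology) {
        int total = 0;
        for (Component c : topology.values()) {
            total += c.executors;
        }
        return total;
    }

    // Largest number of executors a single worker holds, assuming an even spread.
    public static int maxExecutorsPerWorker(Map<String, Component> topology, int workers) {
        int total = totalExecutors(topology);
        return (total + workers - 1) / workers; // ceiling division
    }

    public static void main(String[] args) {
        Map<String, Component> topology = exampleTopology();
        System.out.println("total executors: " + totalExecutors(topology));
        System.out.println("max executors per worker (2 workers): "
                + maxExecutorsPerWorker(topology, 2));
    }
}
```

With these example numbers there are 10 executors in total, so two workers would each run 5 threads for the topology under the even-spread assumption.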

A machine in a Storm cluster may run one or more such worker processes for one or more topologies, and each worker process runs executors for a specific topology (you can even change the number of executors at run time using Storm's rebalancing mechanism). For internal communication within a worker process, Storm uses message queues backed by the LMAX Disruptor. Workers also maintain their own threads, such as a receiver thread and a sender thread, for managing incoming and outgoing messages.

You can take a look at this doc page for a better overview, and at this very nice article explaining how Storm handles parallelism. These might help you dig further; do share your findings :)

Upvotes: 5
