user3045265

Reputation: 43

Configuration of workers in a storm cluster

I have a question about configuration of worker processes.

I understand that worker processes run within a worker node (which is a machine). What I would like to know is whether all worker processes share the same JVM, or whether each worker process has its own JVM instance. If the latter is true, I suppose one has to set how much memory each process may use. Where would this configuration be done?

Upvotes: 4

Views: 11127

Answers (4)

abhi

Reputation: 4792

  • A worker process executes a subset of a topology, and runs in its own JVM.
  • A worker process belongs to a specific topology and may run one or more executors for one or more components (spouts or bolts) of this topology.
  • An executor is a thread that is spawned by a worker process and runs within the worker’s JVM.
  • An executor may run one or more tasks for the same component (spout or bolt).
  • An executor always has one thread that it uses for all of its tasks, which means that tasks run serially on an executor.

So each worker process runs in its own JVM and belongs to a specific topology. A worker node (machine) can host several worker processes, each in a separate JVM, rather than all of them sharing one.

Upvotes: 2

Balkrishan Aggarwal

Reputation: 613

Each Storm worker process runs in its own JVM. The memory allocated to each worker can be set in the conf/storm.yaml configuration file. For example, add or edit the following parameter to allow each worker process up to 1 GB of heap:

worker.childopts: "-Xmx1024m"

This overrides any general JVM memory settings on your machine (for example, those set through JAVA_TOOL_OPTIONS).

For more Storm configuration options, refer to: Storm Configurations

Upvotes: 6

kartik

Reputation: 2135

Configure worker.childopts: "-Xmx4048m" in storm.yaml.

Each worker process can then use up to that much heap memory, if the RAM is available.

Upvotes: 3

user2720864

Reputation: 8161

Each worker process runs independently in its own JVM. A worker node can run one or more worker processes for one or more topologies.
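
As a rough sketch of how this looks on one worker node: the number of worker processes a node can host is bounded by the slot ports listed for its supervisor, and each of those processes is started as its own JVM with the options in worker.childopts. The port numbers and heap size below are illustrative assumptions, not values taken from the question:

```yaml
# conf/storm.yaml on a worker node (illustrative values only)

# One port per worker slot; each slot can hold one worker process (one JVM).
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703

# JVM options passed to every worker process started on this node.
worker.childopts: "-Xmx1024m"
```

With these example values the node could run up to four worker JVMs at once, so the worker heaps alone could need about 4 GB of RAM on that machine.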

If the latter is true, I suppose one has to set how much memory each process may use. Where would this configuration be done?

If you intend to set the JVM parameters, follow the discussion here

Upvotes: 0
