Akhil

Reputation: 646

Apache NiFi - "Execution" Option

Can someone explain what the "Execution" option in Apache NiFi is for?

This option is available for most processors, and there are currently two values to choose from: Primary Node and All Nodes.

One use case I can think of is reading data from Kafka when you want to load-balance the reads.

Maybe I am confused about the purpose of this option, since it is offered for most processors.

Upvotes: 1

Views: 1128

Answers (1)

Bryan Bende

Reputation: 18670

Primary Node Only is for the case where a source processor should only execute on one node. For example, if you had a GetSFTP processor at the start of your flow in a 3 node cluster, you wouldn't want this to run on all 3 nodes because they would all get the same files.
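To make the duplication concrete, here is a toy Python simulation, not NiFi code. The file names, node names, and both helper functions are made up for illustration, and it assumes every node sees the same remote directory while ignoring timing and file-removal details.

```python
from collections import Counter

REMOTE_FILES = ["a.csv", "b.csv", "c.csv"]  # hypothetical files on the SFTP server
NODES = ["node-1", "node-2", "node-3"]      # hypothetical 3-node cluster


def run_source(nodes_running):
    """Each node that runs the source processor pulls every remote file."""
    return [(node, name) for node in nodes_running for name in REMOTE_FILES]


def copies_per_file(pulled):
    """Count how many times each file entered the flow across the cluster."""
    return Counter(name for _, name in pulled)


# Source processor scheduled on all nodes: every file is pulled once per node.
all_nodes = copies_per_file(run_source(NODES))

# Source processor scheduled on the primary node only: one copy of each file.
primary_only = copies_per_file(run_source(NODES[:1]))

print("all nodes:   ", dict(all_nodes))
print("primary only:", dict(primary_only))
```

In this sketch each file enters the flow three times when the source runs on all nodes and once when it runs on the primary node only.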

The most common use of primary node only is probably the List + Fetch pattern. The flow starts with a List processor like ListHDFS, which runs on Primary Node Only, followed by a load-balanced connection to distribute the listings to all nodes, connected to FetchHDFS running on all nodes.

https://pierrevillard.com/2018/10/29/nifi-1-8-revolutionizing-the-list-fetch-pattern-and-more/
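The List + Fetch pattern can be sketched roughly as follows. This is a plain Python model and does not use any NiFi API: the three functions are hypothetical stand-ins for the List processor, the load-balanced connection (assumed here to be round robin), and the Fetch processor, and the paths and node names are invented.

```python
from itertools import cycle

NODES = ["node-1", "node-2", "node-3"]             # hypothetical cluster
PATHS = [f"/data/file-{i}.txt" for i in range(5)]  # hypothetical HDFS paths


def list_on_primary(paths):
    """Stand-in for the List step: runs on one node, emits one listing per file."""
    return [{"path": p} for p in paths]


def load_balance(listings, nodes):
    """Stand-in for a load-balanced connection, assuming round-robin distribution."""
    assignment = {node: [] for node in nodes}
    for node, listing in zip(cycle(nodes), listings):
        assignment[node].append(listing)
    return assignment


def fetch_on_all_nodes(assignment):
    """Stand-in for the Fetch step: each node fetches only the listings it received."""
    return {
        node: [f"content of {listing['path']}" for listing in listings]
        for node, listings in assignment.items()
    }


listings = list_on_primary(PATHS)           # cheap metadata, produced once
assignment = load_balance(listings, NODES)  # listings spread across the cluster
fetched = fetch_on_all_nodes(assignment)    # heavy work done in parallel

for node, contents in fetched.items():
    print(node, len(contents))
```

The point of the split is that only the cheap listing step is restricted to one node, while the expensive fetch is shared by the whole cluster and each file is still fetched exactly once.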

Processors that are not the first processor in the flow should never really be set to primary node only, even though the application allows it. This can probably be improved.

Upvotes: 4
