Shashi

Reputation: 2714

Spark Streaming with Nifi

I am looking for a way to use Spark Streaming from NiFi. I have seen a couple of posts where a Site-to-Site TCP connection feeds a Spark Streaming application, but I think it would be better if I could launch Spark Streaming from a custom NiFi processor.

PublishKafka would publish messages to Kafka, and the NiFi Spark Streaming processor would then read from the Kafka topic.

I can launch a Spark Streaming application from a custom NiFi processor using the Spark Streaming launcher API, but the biggest challenge is that it will create a Spark streaming context for each flow file, which can be a costly operation.

Would anyone suggest storing the Spark streaming context in a controller service? Or is there a better approach for running a Spark Streaming application with NiFi?

Upvotes: 3

Views: 884

Answers (2)

Mike Thomsen

Reputation: 37526

I can launch a Spark Streaming application from a custom NiFi processor using the Spark Streaming launcher API, but the biggest challenge is that it will create a Spark streaming context for each flow file, which can be a costly operation.

You'd be launching a standalone application in each case, which is not what you want. If you are going to integrate with Spark Streaming or Flink, you should be using something like Kafka to pub-sub between them.
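As a rough illustration of the Kafka hand-off, the sketch below shows how one long-running Spark job might subscribe to the topic that NiFi's PublishKafka writes to. It uses Structured Streaming's Kafka source rather than the older DStream API, and it assumes the Spark Kafka connector package is on the job's classpath. The helper names, the broker address, and the topic name are hypothetical.

```python
def kafka_source_options(bootstrap_servers, topic):
    """Options for Spark's Kafka source (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Only read messages that arrive after the job starts.
        "startingOffsets": "latest",
    }


def read_topic(spark, bootstrap_servers, topic):
    """Return a streaming DataFrame over the topic.

    `spark` is an existing SparkSession, created once when the job starts,
    so no context is built per flow file.
    """
    return (
        spark.readStream.format("kafka")
        .options(**kafka_source_options(bootstrap_servers, topic))
        .load()
    )
```

Inside the Spark job this would be called once, for example `read_topic(spark, "localhost:9092", "nifi-out")`, and NiFi would only ever talk to Kafka.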

Upvotes: 0

Ajay Ahuja

Reputation: 1323

You can use ExecuteSparkInteractive to run the Spark code that you would otherwise include in your Spark Streaming application.

You need a few things set up for Spark code to run from within NiFi:

  1. Set up a Livy server.
  2. Add the NiFi controller services that start the Spark Livy sessions:

    LivySessionController

    StandardSSLContextService (may be required)

Once you enable LivySessionController in NiFi, it will start Spark sessions, and you can check in the Spark UI that those Livy sessions are up and running.

Now that the Livy Spark sessions are running, whenever a flow file moves through the NiFi flow, the Spark code in ExecuteSparkInteractive is run.
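To show why this avoids creating a context per flow file, here is a minimal sketch of the Livy REST calls involved: one request creates a long-lived session, and each later request only submits a statement to that session. This is an illustration of Livy's REST API, not NiFi's actual implementation. The `LIVY_URL` value and the helper names are assumptions.

```python
import json
import urllib.request

LIVY_URL = "http://localhost:8998"  # assumption: Livy on its default port


def session_request(kind="pyspark"):
    """Build the request that creates one long-lived Livy session."""
    return "POST", f"{LIVY_URL}/sessions", {"kind": kind}


def statement_request(session_id, code):
    """Build the request that runs a snippet inside an existing session."""
    return "POST", f"{LIVY_URL}/sessions/{session_id}/statements", {"code": code}


def send(method, url, payload):
    """Send a JSON request to Livy and return the decoded JSON reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With a Livy server running, `send(*session_request())` would return the new session's `id`; once that session reports the `idle` state, `send(*statement_request(session_id, "1 + 1"))` submits code to it, and the result can be polled from `/sessions/{id}/statements/{statementId}`.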

This is similar to a Spark Streaming application running outside NiFi. For me this approach works very well and is easier to maintain than a separate Spark Streaming application.

Hope this helps!

Upvotes: 0
