Anil

Reputation: 420

How to gracefully stop a Spark DStream process

I'm trying to read data from a Kafka stream, process it, and save it to a report. I'd like to run this job once a day. I'm using DStreams. Is there an equivalent of trigger(Trigger.Once) in DStreams that I could use for this scenario? I'd appreciate any suggestions and help.

import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

def main(args: Array[String]): Unit = {
  val spark = ...
  val ssc = new StreamingContext(sc, Seconds(jobconfig.getLong("batchInterval")))

  val kafkaStream = KafkaUtils.createDirectStream[String, String](
    ssc,
    LocationStrategies.PreferConsistent,
    ConsumerStrategies.Subscribe[String, String](Array(jobconfig.getString("topic")), kafkaParams))

  kafkaStream.foreachRDD { rdd =>
    // ... process the RDD and build the report ...
  }

  sqlContext.clearCache()
  ssc.start()
  ssc.awaitTermination()
}

Upvotes: 0

Views: 160

Answers (1)

baitmbarek

Reputation: 2518

Anil,

According to the documentation, you should use the spark.streaming.stopGracefullyOnShutdown parameter: https://spark.apache.org/docs/latest/configuration.html

It says:

If true, Spark shuts down the StreamingContext gracefully on JVM shutdown rather than immediately.
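As a rough sketch of how that could look for a once-a-day run (the app name, batch interval, and run window below are placeholders, not taken from your job): set the flag on the SparkConf, and optionally stop the context yourself once the run has had time to drain, using StreamingContext.stop with stopGracefully = true.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object DailyKafkaReport {
      def main(args: Array[String]): Unit = {
        // Graceful shutdown: on JVM shutdown (e.g. SIGTERM from the scheduler),
        // Spark finishes the batches it has already received before stopping.
        val conf = new SparkConf()
          .setAppName("daily-kafka-report") // placeholder name
          .set("spark.streaming.stopGracefullyOnShutdown", "true")

        val ssc = new StreamingContext(conf, Seconds(60)) // placeholder batch interval

        // ... build the Kafka direct stream and the foreachRDD report logic here ...

        ssc.start()

        // Instead of awaitTermination(), wait a bounded time (assumption: long enough
        // for one daily run to drain), then stop gracefully and release the SparkContext.
        ssc.awaitTerminationOrTimeout(10 * 60 * 1000L)
        ssc.stop(stopSparkContext = true, stopGracefully = true)
      }
    }

This bounded-run-then-stop pattern is roughly the DStream equivalent of trigger(Trigger.Once); Trigger.Once itself is only available in Structured Streaming.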

Upvotes: 0
