Reputation: 420
I'm trying to read data from a Kafka stream, process it, and save it to a report, running the job once a day. I'm using DStreams. Is there an equivalent of trigger(Trigger.Once) in DStreams that I could use for this scenario? I'd appreciate any suggestions and help.
def main(args: Array[String]): Unit = {
  val spark = ...
  // Use the SparkContext from the session; `sc` was not defined above.
  val ssc = new StreamingContext(spark.sparkContext, Seconds(jobconfig.getLong("batchInterval")))
  val kafkaStream = KafkaUtils.createDirectStream[String, String](
    ssc,
    LocationStrategies.PreferConsistent,
    ConsumerStrategies.Subscribe[String, String](Array(jobconfig.getString("topic")), kafkaParams))
  kafkaStream.foreachRDD { rdd =>
    // ... process the batch and write the report ...
  }
  spark.sqlContext.clearCache()
  ssc.start()
  ssc.awaitTermination()
}
Upvotes: 0
Views: 160
Reputation: 2518
Anil,
According to the documentation, you should use the spark.streaming.stopGracefullyOnShutdown parameter: https://spark.apache.org/docs/latest/configuration.html
It says:
"If true, Spark shuts down the StreamingContext gracefully on JVM shutdown rather than immediately."
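As a rough sketch of how this could fit your job (the config key is real; the stop-after-first-batch logic and names below are my own illustration, not something from your code or the docs), you could combine this flag with an explicit ssc.stop() once the daily batch has been processed, which approximates trigger(Trigger.Once) for DStreams:

```scala
// Sketch only: assumes the StreamingContext and kafkaStream from the question.
val conf = new SparkConf()
  .setAppName("daily-report")
  // Ask Spark to drain in-flight batches on JVM shutdown instead of killing them.
  .set("spark.streaming.stopGracefullyOnShutdown", "true")

kafkaStream.foreachRDD { rdd =>
  // ... process the batch and write the report ...
  if (!rdd.isEmpty()) {
    // Stop from a separate thread so the current batch can finish;
    // calling ssc.stop() synchronously inside foreachRDD can deadlock.
    new Thread(() => ssc.stop(stopSparkContext = true, stopGracefully = true)).start()
  }
}
```

With this pattern, a daily cron/scheduler launch of the job would consume whatever has accumulated on the topic since the last run and then exit, rather than staying up as a long-running stream.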
Upvotes: 0