Prashant_J

Reputation: 354

Spark Streaming actions and transformations

In the code below, statement 5 does not output the count of lines, but statement 4 prints the lines. Why?
How many action operations get performed in Spark Streaming?
Is statement 6 not executed?

1) val conf = new SparkConf().setMaster("local").setAppName("learn")
2) val ssc = new StreamingContext(conf, Seconds(10))
3) val lines = ssc.socketTextStream("localhost", 1234)
4) lines.print()
5) lines.count()
6) ssc.start()
7) ssc.awaitTermination()

Upvotes: 0

Views: 811

Answers (1)

Hokam

Reputation: 924

Please find the answers to your questions below:

Why is statement 5 not giving the count of lines in the output while statement 4 prints the lines?

Statement 5 only calls count() on the DStream. count() is a transformation: it returns a new DStream of single-element RDDs, each holding the number of elements in the corresponding RDD of the source DStream. Because nothing consumes that new DStream, the count is never computed or shown.
To print the count of lines, use lines.count().print().
Statement 4 produces output because print() is an output operation (the streaming counterpart of an RDD action), and output operations are what trigger execution.
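
As a minimal sketch, the program from the question could be changed as follows so that the count is printed as well. The object name LineCount is hypothetical, and the master is set to local[2] as an assumption: a receiver occupies one thread, so a plain local master may leave no thread for processing the batches.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical wrapper object; the question only shows the statements themselves.
object LineCount {
  def main(args: Array[String]): Unit = {
    // Assumption: local[2] instead of local, so that one thread can run the
    // receiver while another processes the received batches.
    val conf = new SparkConf().setMaster("local[2]").setAppName("learn")
    val ssc = new StreamingContext(conf, Seconds(10))
    val lines = ssc.socketTextStream("localhost", 1234)

    // Output operation: prints the first ten elements of every batch.
    lines.print()

    // count() alone only defines a new DStream[Long];
    // chaining print() registers an output operation that makes it run.
    lines.count().print()

    ssc.start()            // nothing is processed before this call
    ssc.awaitTermination() // block until the context is stopped
  }
}
```

To try it, something like `nc -lk 1234` can serve as the text source on localhost; each 10-second batch should then show the received lines followed by their count.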

How many action operations get performed in spark streaming?

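In the code above only one output operation is performed: print(), in statement 4. In general, you can register any number of output operations on the same streaming context; by default they are executed one at a time for each batch, in the order in which they are defined.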
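
The following sketch shows two output operations registered on the same DStream. It is an illustration, not part of the original question: the object name TwoOutputs and the local[2] master are assumptions, and foreachRDD is used here as a second output operation next to print().

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical example object illustrating several output operations.
object TwoOutputs {
  def main(args: Array[String]): Unit = {
    // Assumption: local[2] so the receiver does not use the only thread.
    val conf = new SparkConf().setMaster("local[2]").setAppName("learn")
    val ssc = new StreamingContext(conf, Seconds(10))
    val lines = ssc.socketTextStream("localhost", 1234)

    // First output operation: print the first ten lines of each batch.
    lines.print()

    // Second output operation: foreachRDD hands over the RDD of each batch.
    // Inside it, rdd.count() is an ordinary RDD action.
    lines.foreachRDD { rdd =>
      println(s"Lines in this batch: ${rdd.count()}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

For every batch interval, the output of print() should appear before the message from foreachRDD, since that is the order in which the two operations were defined.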

Is statement 6 not executed?

Statement 6 is executed. It starts the streaming context, and the receiver begins receiving data from the source. Statements 1 to 5 only define the computation; nothing is processed until ssc.start() is called.

Upvotes: 1
