Reputation: 354
In the code below, statement 5 does not give the count of lines, but statement 4 prints the lines. Why?
How many action operations get performed in Spark Streaming?
Is statement 6 not executed?
1) val conf = new SparkConf().setMaster("local").setAppName("learn")
2) val ssc = new StreamingContext(conf, Seconds(10))
3) val lines = ssc.socketTextStream("localhost", 1234)
4) lines.print()
5) lines.count()
6) ssc.start()
7) ssc.awaitTermination()
Upvotes: 0
Views: 811
Reputation: 924
Please find the answers to your questions:
Why is statement 5 not giving the count of lines in the output, while statement 4 prints the lines?
In statement 5 you are only calling count() on a DStream. count() is a transformation: it returns a new DStream of single-element RDDs, each holding the number of elements in the corresponding RDD of the source DStream. Nothing consumes that new DStream, so nothing is displayed.
If you want to print the count of lines, use lines.count().print().
Statement 4 gives output because print() is an output operation (the streaming counterpart of an action), which prints the first elements of each batch.
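A minimal sketch of the corrected program, in case it helps to see it in one piece. The object name CountLines is hypothetical, and the switch to "local[2]" is an assumption of mine rather than part of the question: a socket receiver occupies one thread, so a local master usually needs at least one more thread to process the batches.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical object name, used only for this illustration.
object CountLines {
  def main(args: Array[String]): Unit = {
    // Assumption: "local[2]" instead of "local", so that one thread can run
    // the socket receiver and another can process the received batches.
    val conf = new SparkConf().setMaster("local[2]").setAppName("learn")
    val ssc = new StreamingContext(conf, Seconds(10))
    val lines = ssc.socketTextStream("localhost", 1234)

    // Output operation: prints the first elements of every batch.
    lines.print()

    // count() alone is only a transformation; chaining print() adds an
    // output operation, so the per-batch count is actually displayed.
    lines.count().print()

    ssc.start()            // starts the receiver and the batch scheduling
    ssc.awaitTermination() // blocks until the context is stopped
  }
}
```

To try it, something like `nc -lk 1234` on the same machine can feed lines to the socket.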
How many action operations get performed in Spark Streaming?
In your code only one output operation is performed: print(), in statement 4. In general, you can register any number of output operations on the same streaming context; for each batch they run one at a time, in the order they are defined.
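As a rough illustration of several output operations on one stream, the sketch below registers both print() and foreachRDD on the same DStream. The object name TwoOutputs and the "local[2]" master are assumptions for the example, not something taken from the question.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical object name, used only for this illustration.
object TwoOutputs {
  def main(args: Array[String]): Unit = {
    // Assumption: "local[2]" leaves a thread free for processing
    // while the socket receiver occupies the other one.
    val conf = new SparkConf().setMaster("local[2]").setAppName("learn")
    val ssc = new StreamingContext(conf, Seconds(10))
    val lines = ssc.socketTextStream("localhost", 1234)

    // First output operation: print the first elements of each batch.
    lines.print()

    // Second output operation: foreachRDD hands over the RDD of each batch.
    // rdd.count() here is an ordinary RDD action, so it returns a number.
    lines.foreachRDD { rdd =>
      println(s"Lines in this batch: ${rdd.count()}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note the difference from statement 5: count() on a DStream yields another DStream, while count() on the RDD inside foreachRDD yields a value directly.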
Is statement 6 not executed?
Statement 6 is executed. It starts the streaming context, and the receiver begins receiving data from the source.
Upvotes: 1