Reputation: 13835
I am learning Spark Streaming and trying to find hash tags in some production logs.
In some examples, I found the following code:
val words = statuses.flatMap(line => line.split(" "))
val tags = words.filter(w => w.startsWith("#"))
val tagKeyValues = tags.map(tag => (tag, 1))
val tagCounts = tagKeyValues.reduceByKeyAndWindow( (x,y) => x + y, (x,y) => x - y, Seconds(300), Seconds(1))
The code works fine, but I did not understand how this reduceByKeyAndWindow works here. Why are we decrementing the values in its second argument?
Upvotes: 0
Views: 1308
Reputation: 4540
The inverse reduce function is used to optimise sliding-window performance. With a window duration of 300s and a slide duration of 1s, the new reduced value can be computed from the previous reduced value by subtracting the 1s of old data that falls out of the window and adding the 1s of new data that enters it, instead of re-reducing all 300s of data from scratch. There is also a version of reduceByKeyAndWindow without the inverse function, which can be used when the reduce function is not invertible.
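To make the trick concrete, here is a minimal pure-Scala sketch (no Spark required, all names are my own) that compares the naive per-slide recomputation with the incremental update that `reduceByKeyAndWindow` performs when given an inverse function. It uses a toy 3-batch window over per-batch counts for a single tag:

```scala
// Sketch of the incremental sliding-window update used by
// reduceByKeyAndWindow when an inverse reduce function is supplied.
object IncrementalWindow {
  def main(args: Array[String]): Unit = {
    val batches = Seq(3, 1, 4, 1, 5, 9, 2, 6) // per-batch counts for one tag
    val window  = 3                           // window length, in batches

    // Naive approach: re-reduce the whole window on every slide.
    val naive = batches.indices.map { i =>
      batches.slice(math.max(0, i - window + 1), i + 1).sum
    }

    // Incremental approach: add the entering batch, subtract the leaving one.
    var sum = 0
    val incremental = batches.indices.map { i =>
      sum += batches(i)                           // reduce:  (x, y) => x + y
      if (i >= window) sum -= batches(i - window) // inverse: (x, y) => x - y
      sum
    }

    assert(naive == incremental) // both strategies agree
    println(incremental)         // prints Vector(3, 4, 8, 6, 10, 15, 16, 17)
  }
}
```

The incremental version does O(1) work per slide regardless of window length, whereas the naive version does O(window) work; with a 300s window sliding every 1s, that difference is substantial.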
The algorithm implementation is verbosely commented and easy to follow: https://github.com/apache/spark/blob/5264164a67df498b73facae207eda12ee133be7d/streaming/src/main/scala/org/apache/spark/streaming/dstream/ReducedWindowedDStream.scala#L79
Upvotes: 4