Reputation: 31
I am using Spark with Scala and would like to create a window operation whose length is set as a number of objects: the window starts empty, objects are stored in it as the stream delivers them until it holds 10, and when the 11th arrives the first is dropped.
Is this possible, or do I have to use another structure such as a list or an array? The documentation (http://spark.apache.org/docs/latest/streaming-programming-guide.html#window-operations) and some googling only refer to time-based windows (length and interval).
Thank you in advance.
Upvotes: 1
Views: 154
Reputation: 185
A window in Spark Streaming is characterized by windowDuration and, optionally, slideDuration, so it is a time window. You could consider using Apache Flink instead, which supports both count windows and time windows. Flink follows a different streaming model than Spark, though: it processes incoming events one at a time as they arrive, whereas Spark Streaming processes them in micro-batches. As a result, Flink may have some restrictions of its own. Give it a try if it suits your needs.
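If switching engines is not an option, the "list or array" idea from the question can also work: keep a bounded buffer by hand. Below is a minimal sketch in plain Scala, not a Spark or Flink API. The names CountWindow, capacity and add are made up for illustration. It only shows the eviction logic (the 11th element pushes out the 1st); how you carry such a buffer across micro-batches in Spark Streaming (for example as per-key state) is left open here and depends on your job.

```scala
import scala.collection.immutable.Queue

// Hypothetical helper (not part of Spark or Flink): an immutable sliding
// window that holds at most `capacity` elements and evicts the oldest
// element once that limit is exceeded.
final case class CountWindow[A](capacity: Int, items: Queue[A] = Queue.empty[A]) {
  require(capacity > 0, "capacity must be positive")

  // Returns a new window containing `elem`; the oldest element is dropped
  // if adding `elem` would exceed `capacity`.
  def add(elem: A): CountWindow[A] = {
    val grown = items.enqueue(elem)
    copy(items = if (grown.size > capacity) grown.tail else grown)
  }
}

object CountWindowDemo {
  def main(args: Array[String]): Unit = {
    // Feed 11 elements into a window of size 10: element 1 is evicted.
    val window = (1 to 11).foldLeft(CountWindow[Int](capacity = 10))(_ add _)
    println(window.items.toList) // List(2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
  }
}
```

Because the window is immutable, each add returns a new value, which makes it easy to use as a state object. Keep in mind that within a micro-batch the arrival order of elements is only as reliable as your source and partitioning make it.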
Upvotes: 2