jossdb

Reputation: 43

Get all events with specific key from Kafka

I have topics in my system that store events for given entities. Now I would like to perform some analysis on the event log, so I need to query all events belonging to some entity within a certain time period. Is it possible to aggregate all events for a certain key within a time window using Kafka Streams?

Upvotes: 1

Views: 2913

Answers (2)

Matthias J. Sax

Reputation: 62285

It really depends on how you want to set up your system, what kind of analysis you want to do, and what exactly you mean by "querying".

For a one-time analysis, you might just want to do a stream.transform(...).to(), filter on key and timestamp in your Transformer (context.timestamp() is your friend), and write the result into a topic. Hence, you would run this program once for some key and time range. You might even be able to do the required analysis before writing any result, if you use a WindowStore (with duplicates enabled) to buffer all the data in the store.
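The filtering step in that one-time approach boils down to a predicate on the record key and the record timestamp. Here is a minimal plain-Java sketch of that predicate with no Kafka dependency. The `Event` and `EventFilter` classes are hypothetical names made up for illustration, and the half-open range [fromMs, toMs) is an assumption; in a real Transformer the timestamp would come from context.timestamp() rather than from a field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Kafka record: key, value, and timestamp (ms since epoch).
final class Event {
    final String key;
    final String value;
    final long timestampMs;

    Event(String key, String value, long timestampMs) {
        this.key = key;
        this.value = value;
        this.timestampMs = timestampMs;
    }
}

// Hypothetical helper, not part of Kafka Streams.
final class EventFilter {
    // Keep the events for one key whose timestamp lies in [fromMs, toMs).
    static List<Event> select(List<Event> log, String key, long fromMs, long toMs) {
        List<Event> result = new ArrayList<>();
        for (Event e : log) {
            boolean keyMatches = e.key.equals(key);
            boolean inRange = e.timestampMs >= fromMs && e.timestampMs < toMs;
            if (keyMatches && inRange) {
                result.add(e);
            }
        }
        return result;
    }
}
```

In a Transformer, the same check would decide per record whether to forward it downstream or drop it.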

If you want to write a program that prepares *all* data for analysis, you would use groupBy() (or groupByKey()). Using windowedBy() with TimeWindows only works if you know upfront the time ranges you want to group the data by (e.g., hourly, daily, or similar). For the aggregation itself, you could return a List<Value> object and accumulate the corresponding records per key and window. This way, you can use Interactive Queries (IQ) to get all records for a specific key and window with a single lookup.
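To make the per-key, per-window accumulation concrete, here is a plain-Java model of it with no Kafka dependency. This is only a sketch of the idea, not the Kafka Streams API: `WindowedEventLog` and its methods are hypothetical names, values are simplified to strings, and timestamps are assumed to be non-negative milliseconds.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a windowed List<Value> aggregate.
final class WindowedEventLog {
    private final long windowSizeMs;
    // Maps "key@windowStart" to the values accumulated for that key and window.
    private final Map<String, List<String>> store = new HashMap<>();

    WindowedEventLog(long windowSizeMs) {
        this.windowSizeMs = windowSizeMs;
    }

    // Start of the fixed-size, non-overlapping window containing the timestamp
    // (assumes timestampMs >= 0).
    long windowStart(long timestampMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    // Aggregation step: append the value to the list for (key, window).
    void add(String key, String value, long timestampMs) {
        String slot = key + "@" + windowStart(timestampMs);
        store.computeIfAbsent(slot, k -> new ArrayList<>()).add(value);
    }

    // Single lookup for one key and one window, analogous to querying the store.
    List<String> lookup(String key, long windowStartMs) {
        return store.getOrDefault(key + "@" + windowStartMs, Collections.emptyList());
    }
}
```

In an actual topology, the grouping, windowing, and accumulation would be expressed with groupByKey(), windowedBy(...), and aggregate(...), with the list kept in a windowed state store. That setup also needs a Serde for the list type.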

Upvotes: 0

OneCricketeer

Reputation: 191671

Sounds like you just want the groupByKey() method of the DSL.

Upvotes: 0
