Reputation: 582
I have multiple (3 to be precise as of now) streams (of different types) from different kafka topics. They have a common property userId
. All I want to do now is to partition by userId and then add some business logic to it. How can I partition by userId all streams and ensure that all the events go to the same task processor so that userId state is accessible ?
I could have used ConnectedStream but here the usecase is for more than 2 different kind of streams.
Also I was wondering weather something like this would guarantee same task processor
MyBusinessProcess businessProcess() = new MyBusinessProcess();
streamA.keyBy(event -> event.userId).process(businessProcess);
streamB.keyBy(event -> event.userId).process(businessProcess);
streamC.keyBy(event -> event.userId).process(businessProcess);
Edit: I just realised that for businessProcess, how would it differentiate between which event is coming in if there are stream of multiple types. Gets me thinking more since this seems like a naive streams problem.
Thanks.
Upvotes: 1
Views: 649
Reputation: 9245
I would create a class (let's call it Either3
) that has a userID field, and then three additional fields (only one of which is ever set) that contain your three different stream's data type (look at Flink's Either
class for how to do this for 2 values).
Then use a map function on each of your three streams to convert from class A/B/C to an Either3
with the appropriate value set.
Now you can .union()
your three streams together, and run that one stream into your business process function, which can maintain state as needed.
Upvotes: 3