Reputation: 13
I am new to PyFlink and I have a Kafka stream with the fields phone_number, host_name and event_time, all as strings. How can I compute the number of visits for each (phone_number, host_name) pair during the last 24 hours using the PyFlink DataStream API?
I tried looking at the examples on the official GitHub, but I don't understand whether I need to define a watermark strategy or not.
Upvotes: 0
Views: 54
Reputation: 43697
Watermarks are necessary if both of these conditions hold:
So in your case, you will need to define a watermark strategy if you are using event time rather than processing time.
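To make the event-time case more concrete, here is a minimal plain-Python sketch of what a bounded-out-of-orderness watermark strategy does conceptually. It is not PyFlink code: in PyFlink you would typically get this behaviour from `WatermarkStrategy.for_bounded_out_of_orderness(...)` combined with a `TimestampAssigner` that returns epoch milliseconds. The `event_time` format, the UTC assumption, and the names `to_epoch_millis` and `BoundedOutOfOrdernessTracker` are assumptions made for illustration.

```python
from datetime import datetime, timezone

# Assumption: event_time strings look like "2023-05-01 12:00:00" and are in UTC.
EVENT_TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_epoch_millis(event_time: str) -> int:
    """Parse the event_time string into epoch milliseconds.

    A timestamp assigner has to produce this kind of value for each record.
    """
    parsed = datetime.strptime(event_time, EVENT_TIME_FORMAT)
    return int(parsed.replace(tzinfo=timezone.utc).timestamp() * 1000)


class BoundedOutOfOrdernessTracker:
    """Hypothetical helper that mimics a bounded-out-of-orderness watermark.

    The watermark trails the largest timestamp seen so far by a fixed bound,
    so events arriving late by less than that bound are still in time.
    """

    def __init__(self, max_out_of_orderness_ms: int):
        self.max_out_of_orderness_ms = max_out_of_orderness_ms
        self.max_timestamp_ms = None  # no events seen yet

    def on_event(self, timestamp_ms: int) -> None:
        # Only ever move the maximum forward; late events do not lower it.
        if self.max_timestamp_ms is None or timestamp_ms > self.max_timestamp_ms:
            self.max_timestamp_ms = timestamp_ms

    def current_watermark(self):
        # No watermark can be derived before the first event arrives.
        if self.max_timestamp_ms is None:
            return None
        return self.max_timestamp_ms - self.max_out_of_orderness_ms - 1


tracker = BoundedOutOfOrdernessTracker(max_out_of_orderness_ms=5000)
for raw_time in ["1970-01-01 00:00:10", "1970-01-01 00:00:08", "1970-01-01 00:00:20"]:
    tracker.on_event(to_epoch_millis(raw_time))
print(tracker.current_watermark())
```

An event-time window over the last 24 hours only emits its result once the watermark passes the end of the window, which is why the watermark strategy matters here. With processing time, the wall clock of the machine plays that role instead.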
Upvotes: 0