Yair Cohen

Reputation: 507

Apache Flink - Compute the last window based on event time

My job does the following things:

  1. Consumes events from a Kafka topic, based on event time.
  2. Computes windows of 7 days with a slide of 1 day (a minimal sketch of such a job follows this list).
  3. Sinks the results to Redis.
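
A minimal sketch of the kind of job described, assuming Flink's DataStream API with the KafkaSource connector; the broker address, topic name, timestamp parsing, aggregation, and sink are placeholders, not the real code:

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;

    import java.time.Duration;

    public class SevenDaySlidingJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // 1. Consume from Kafka; the start position is either the latest
            //    offset or a timestamp 7 days back (the two cases described below).
            KafkaSource<String> source = KafkaSource.<String>builder()
                    .setBootstrapServers("localhost:9092")   // placeholder broker
                    .setTopics("events")                     // placeholder topic
                    .setGroupId("seven-day-window-job")
                    .setStartingOffsets(OffsetsInitializer.latest())
                    // or: .setStartingOffsets(OffsetsInitializer.timestamp(sevenDaysAgoMillis))
                    .setValueOnlyDeserializer(new SimpleStringSchema())
                    .build();

            env.fromSource(source,
                            WatermarkStrategy
                                    .<String>forBoundedOutOfOrderness(Duration.ofMinutes(5))
                                    // assumes the event timestamp is the first CSV field
                                    .withTimestampAssigner(
                                            (record, ts) -> Long.parseLong(record.split(",")[0])),
                            "kafka-events")
                    // 2. 7-day windows sliding by 1 day, evaluated on event time.
                    .windowAll(SlidingEventTimeWindows.of(Time.days(7), Time.days(1)))
                    .reduce((a, b) -> a + "," + b)  // stand-in aggregation
                    // 3. A real job would attach a Redis sink here; print() is a stand-in.
                    .print();

            env.execute("seven-day-sliding-window");
        }
    }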

I have several issues:

  1. If it consumes Kafka events starting from the latest record, then after the job has been alive for 1 day it closes a window and computes a 7-day result. The problem is that the job only has data for 1 day, so the results are wrong.
  2. If I instead let it consume Kafka events from a timestamp of 7 days ago, then as soon as the job starts it computes all the windows from the first day onward, which takes a long time. Also, I only want the results of the last window, because that is what matters to me.

Have I missed something? Is there a better way to do that?

Upvotes: 0

Views: 796

Answers (1)

David Anderson

Reputation: 43499

Flink aligns time windows to the epoch. So if you have windows that are one hour long, they run from the top of the hour to the top of the hour. Day-long windows run from midnight to midnight. The same principle applies to windows that are seven days long, and since the epoch began on a Thursday (Jan 1, 1970), a window that is seven days long should close at midnight on Wednesday night / Thursday morning.

You can supply an offset to the window constructor if you want to shift the windows to start at a different time.
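
For example, here is a sketch applied to the question's 7-day/1-day sliding windows, using the three-argument form of SlidingEventTimeWindows.of, whose third argument is the offset. Note that with a 1-day slide the offset only matters modulo the slide, so a whole-day offset would be a no-op; a sub-day offset such as Time.hours(-8) (the time-zone shift example from the Flink documentation) moves the daily window boundary away from midnight UTC:

    // Only the window assigner changes relative to the job in the question.
    // Time.hours(-8) aligns the window boundaries with midnight in UTC+8;
    // a multiple-of-one-day offset would leave the alignment unchanged.
    .windowAll(SlidingEventTimeWindows.of(
            Time.days(7),     // window size
            Time.days(1),     // slide
            Time.hours(-8)))  // offset shifting the alignment

Because the slide is one day, a window still closes once per day; the offset only changes the time of day at which it closes.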

Upvotes: 3
