Reputation: 349
We are using Flink 1.9.1.
We have a source, a process function, and a sink. The application consumes and produces to kinesis.
The input rate (produced by a simulator) is 20 events per second. The per second output rate for the process function shows 14 per second. The back pressure metrics for the source is shown as OK (green). The event count (Number of events sent by the source) and the number of events received by the process function also match with very little delay.
But this count does not match the event count pushed by the simulator. This count matches the 14 per second rate.
Now my question is, does Flink regulate the input rate automatically? In my case, how is the input rate controlled at 14 per second? If it is not, is there any other metric that I should be looking at that I'm missing?
Upvotes: 0
Views: 1013
Reputation: 43499
It's not possible to force a Flink pipeline to consume events at a particular rate. By design, there is limited buffering in the network stack, and the slowest task in the execution graph will dictate the rate at which the pipeline will consume and process events.
The back pressure monitoring (that green OK signal) is not a definitive guide to whether back pressure is occuring. So long as the job is able to make steady forward progress, it probably won't indicate that there's a problem. You could examine some of the network queue metrics to get more insight: e.g., inPoolUsage
, outPoolUsage
, inputQueueLength
. See Flink Network Stack Vol. 2: Monitoring, Metrics, and that Backpressure Thing for a lot more on this topic.
20 events per second seems very slow, so I am a bit surprised that something can't keep up with that rate, but that appears to be what's happening.
Upvotes: 2