Reputation: 533
We're consuming data from a compacted kafka topic using upsert-kafka
connector and converting the same to a DataStream using .toChangelogStream
. We have another DataStream from source kafka created using kafka
connector. Now, we are creating Keyed streams from both and connecting them using a KeyedCoProcessFunction
implementation.
In Flink UI, we see huge number of records in the source kafka operator, even though we do not have those many records being published to that topic. When I avoid .connect
I could see less records in the source operator.
The source stream has ~50K records and the compacted stream has ~200 records. In the KeyedCoProcessFunction
we are looking for a common key and emitting the required records.
What am I missing here? Can someone please help on this?
Upvotes: 0
Views: 116
Reputation: 533
There was no issue with the source or with the code. Flink UI has combined the operations and showed it as one operator. I updated the parallelism for few operations and then UI has shown the expected numbers.
Upvotes: 0