Somasundaram Sekar

Reputation: 5524

Use cases for GroupCombine in Flink

Can someone shed some light on practical use cases for GroupCombine on a grouped DataSet in Apache Flink?

Ref: https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/batch/dataset_transformations.html#groupcombine-on-a-grouped-dataset

Upvotes: 0

Views: 166

Answers (1)

Tanmay Deshpande

Reputation: 509

GroupCombine is mainly an optimization. Unlike GroupReduce, it does not shuffle any data; it works only on individual partitions, so it may emit partial results for a group. This reduces the amount of data sent to the subsequent reduce operation. Put simply, it is a local reduce operation.
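To make the "local reduce" idea concrete, here is a minimal sketch in plain Java. It deliberately does not use the Flink API; it only simulates the pattern with in-memory lists standing in for partitions. The class and method names (`LocalCombineSketch`, `combinePartition`, `reducePartials`, `wordCount`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of combine-then-reduce; not Flink code.
public class LocalCombineSketch {

    // Combine step: pre-aggregate counts inside ONE partition, with no data exchange.
    static Map<String, Integer> combinePartition(List<String> partition) {
        Map<String, Integer> partial = new LinkedHashMap<>();
        for (String word : partition) {
            partial.merge(word, 1, Integer::sum);
        }
        return partial;
    }

    // Reduce step: merge the partial counts arriving from all partitions.
    static Map<String, Integer> reducePartials(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, count) -> total.merge(word, count, Integer::sum));
        }
        return total;
    }

    // Full pipeline: combine each partition locally, then reduce the partial results.
    static Map<String, Integer> wordCount(List<List<String>> partitions) {
        List<Map<String, Integer>> partials = new ArrayList<>();
        for (List<String> partition : partitions) {
            partials.add(combinePartition(partition));
        }
        return reducePartials(partials);
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b", "a"),
                Arrays.asList("b", "b", "c"));
        System.out.println(wordCount(partitions)); // prints {a=2, b=3, c=1}
    }
}
```

In this toy input each partition holds three records but emits only two partial records after the combine step, so four records instead of six would have to cross the network before the final reduce.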

If you are familiar with MapReduce in Hadoop, it has a combiner operation as well. GroupCombine in Flink works in much the same way.

Here is a visual representation of Combiner in Hadoop.

[Image: diagram of the combiner step in Hadoop MapReduce]

Hope this helps!

Upvotes: 1
