Punit Naik

Reputation: 515

How to group a DataSet on multiple keys?

I have a DataSet of HashMaps and have performed a groupBy on one of the keys. Now I want to perform one or more additional groupBy operations on the already grouped DataSet (that is, a nested or chained groupBy).

For example, I would like something like this:

data.groupBy(_("a")).groupBy(_("b")).reduceGroup {....}

How will I be able to do this?

Upvotes: 1

Views: 1066

Answers (1)

Fabian Hueske

Reputation: 18987

Nested groupBy is not supported in Flink. You can, however, group on composite keys:

val data: DataSet[(Int, Int, Long)] = ???
// group on tuple fields 0 and 1 together (one composite key)
val res = data.groupBy(0, 1).reduce(...)
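
As a fuller illustration of the composite-key call above, here is a minimal sketch. It assumes a Flink 1.x release with the Scala DataSet API (flink-scala, plus flink-clients for local execution) on the classpath. The object name, the helper `sumByFirstTwoFields`, the sample values, and the choice of summing the third field are all hypothetical, picked only to give `reduce` something concrete to do.

```scala
import org.apache.flink.api.scala._

object CompositeKeyGrouping {
  type Row = (Int, Int, Long)

  // Hypothetical helper: sums the third field for every distinct
  // combination of the first two fields.
  def sumByFirstTwoFields(data: DataSet[Row]): DataSet[Row] =
    data
      .groupBy(0, 1) // composite key: tuple fields 0 and 1 together
      .reduce((x: Row, y: Row) => (x._1, x._2, x._3 + y._3))

  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    // Sample data, invented for this sketch.
    val data = env.fromElements((1, 1, 10L), (1, 1, 5L), (1, 2, 7L), (2, 1, 3L))
    // print() triggers execution and writes one row per (field 0, field 1) pair.
    sumByFirstTwoFields(data).print()
  }
}
```

Records that agree on both fields end up in the same group, which is what a chain of two groupBy calls would be expected to produce.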

In your example you are using KeySelector functions, which cannot be combined into a composite key. Instead, you can define a single KeySelector that returns both keys as a Tuple2:

data.groupBy(d => (d("a"), d("b")) ).reduce(...)
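
To see what grouping by a tuple-valued key does, the same key function can be tried on an ordinary Scala collection. The sketch below deliberately uses plain Scala `Seq` and `Map` rather than a Flink DataSet, so it runs without any Flink dependency; the record type, the field `"v"`, and the sample values are assumptions made for illustration.

```scala
object TupleKeyGrouping {
  // Assumed record shape: string keys mapped to string values.
  type Record = Map[String, String]

  // Groups records by the pair of values stored under "a" and "b".
  // Two records share a group only if they agree on both values.
  def groupByAandB(records: Seq[Record]): Map[(String, String), Seq[Record]] =
    records.groupBy(d => (d("a"), d("b")))

  def main(args: Array[String]): Unit = {
    // Sample data, invented for this sketch.
    val records = Seq(
      Map("a" -> "x", "b" -> "1", "v" -> "10"),
      Map("a" -> "x", "b" -> "1", "v" -> "20"),
      Map("a" -> "x", "b" -> "2", "v" -> "30")
    )
    groupByAandB(records).foreach { case (key, group) =>
      println(s"$key -> ${group.size} record(s)")
    }
  }
}
```

Because the key is the pair itself, one groupBy yields the same partitioning that grouping by "a" and then by "b" inside each group would give.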

Upvotes: 3
