Reputation: 784
I'm using Python and want to implement a group-by over multiple columns in Apache Beam. For example, I have the dataset below, with 3 columns:
GM TV 7500.2 abc
ONLINE 2000.1 def
CONSOLE 1000.2 ghi
CONSOLE 6500.6 ghi
GM TV 4500.5 abc
CONSOLE 9500.4 ghi
How can I group the data by the first and third columns?
Upvotes: 1
Views: 1823
Reputation: 1901
You can use a tuple of (column 1, column 3) as the key in your GroupByKey (GBK) transform.
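Here is a minimal sketch of that idea using the sample rows from the question. It assumes the columns are whitespace-separated and that only the first column can contain a space (as in "GM TV"), so each line is split from the right. The names `ROWS`, `key_by_first_and_third`, and `run_pipeline` are made up for illustration; adapt the parsing to however your data is actually read.

```python
# Sample rows copied from the question.
ROWS = [
    "GM TV 7500.2 abc",
    "ONLINE 2000.1 def",
    "CONSOLE 1000.2 ghi",
    "CONSOLE 6500.6 ghi",
    "GM TV 4500.5 abc",
    "CONSOLE 9500.4 ghi",
]


def key_by_first_and_third(line):
    """Turn one raw line into ((col1, col3), col2)."""
    # Assumption: only the first column may contain spaces, so split
    # from the right into exactly three fields.
    col1, col2, col3 = line.rsplit(None, 2)
    return (col1, col3), float(col2)


def run_pipeline(rows=ROWS):
    """Group the rows by (col1, col3) and print each group."""
    # Imported here so the helper above can be used without Beam installed.
    import apache_beam as beam

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "Create" >> beam.Create(rows)
            # Emit (key, value) pairs where the key is the tuple (col1, col3).
            | "KeyByCols1And3" >> beam.Map(key_by_first_and_third)
            # GroupByKey collects all values that share the same tuple key.
            | "GroupByKey" >> beam.GroupByKey()
            | "Print" >> beam.Map(print)
        )
```

Calling `run_pipeline()` runs this locally on the default runner. If you want an aggregate per group rather than the grouped values themselves, replacing `beam.GroupByKey()` with something like `beam.CombinePerKey(sum)` should work with the same tuple key.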
Upvotes: 2