Reputation: 502
I want to group by hour. I know there will be multiple entries for the same hour; for example, hour 11 appears several times below. How do I do this?
hour,windSpeed
11, 3.6
2 , 6.8
11, 2.5
13, 5.0
14, 8.9
11, 3.2
I have this data and only want to group by hour.
So, for example,
we'd like {11: 3.6, 2.5, 3.2}
and the remaining hours, which have only one value each, would each form their own group:
{14: 8.9}
{2: 6.8}
answer = FOREACH weather_data GENERATE $0 AS hour, $1 AS speed;
Upvotes: 0
Views: 86
Reputation: 186
Try this.
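-- The sample is comma-separated, so set the delimiter explicitly (the default is tab).
A = LOAD 'data' USING PigStorage(',') AS (Hour:chararray, windSpeed:chararray);
-- One row per distinct Hour; the grouped rows land in a bag named A.
B = GROUP A BY Hour;
-- Keep the key and only the windSpeed values from each group's bag.
C = FOREACH B GENERATE group AS Hour, A.windSpeed;
Note: this code is untested.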
Upvotes: 1
Reputation: 11080
Group by hour:
A = FOREACH weather_data GENERATE $0 AS hour, $1 AS speed;
B = GROUP A BY hour;
DUMP B;
If you want to aggregate, use SUM:
C = FOREACH B GENERATE group AS hour, SUM(A.speed) AS Total;
DUMP C;
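If you would rather see the individual speeds for each hour than their sum, something like the following might work. This is an untested sketch: the file name 'weather.csv', the absence of a header row, and the aliases raw, by_hour and speeds_by_hour are my assumptions, not part of the question. It trims the stray spaces visible in the sample rows before casting, then uses the built-in BagToString to join each hour's speeds into a single string.

```pig
-- Assumption: 'weather.csv' is a hypothetical comma-separated file with no header row.
raw = LOAD 'weather.csv' USING PigStorage(',') AS (hour:chararray, windSpeed:chararray);

-- The sample rows contain stray spaces (e.g. "2 , 6.8"), so trim before casting.
weather_data = FOREACH raw GENERATE (int)TRIM(hour) AS hour, (double)TRIM(windSpeed) AS speed;

-- One output row per distinct hour; the second field is a bag of the grouped rows.
by_hour = GROUP weather_data BY hour;

-- Keep only the speeds and join them into one delimited string per hour.
speeds_by_hour = FOREACH by_hour GENERATE group AS hour, BagToString(weather_data.speed, ', ') AS speeds;

DUMP speeds_by_hour;
```

Hours that occur only once still get their own row, with a single value in the string. The order of values within a group is not guaranteed.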
Upvotes: 1