Reputation: 33
I have some data like (name, score):
A 10
B 25
C 15
A 5
A 36
B 98
C 78
C 78
B 12
data = LOAD 'demo.txt' using PigStorage (',') as (name : chararray , score : int);
groupScore = GROUP data by score;
totalscore = FOREACH groupScore Generate data.name , SUM(data.score);
When I use the SUM() function, the output comes out like
{(A)(A)(A), (51)}
{(B)(B)(B), (135)}
I'm wondering if there is any way I could show it like
{(A), (51)},
that is, without repeating the "name" field for every occurrence?
Any guidance will help.
Upvotes: 0
Views: 49
Reputation: 1766
Group by name instead of score, and project the group key rather than data.name. The query below does that:
data = LOAD 'demo.txt' USING PigStorage(',') AS (name:chararray,score:int);
groupScore = group data by name;
result= FOREACH groupScore GENERATE group,SUM(data.score);
Output:
(A,51)
(B,135)
(C,171)
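If it helps to see why `group` gives a single name while `data.name` repeats it, here is a minimal sketch. It assumes demo.txt holds comma-separated lines such as `A,10` (the sample in the question is shown with spaces), and the aliases `name` and `total` in the last statement are only illustrative names, not anything from the original query.

```pig
-- Sketch only: assumes demo.txt contains comma-separated lines like "A,10".
data = LOAD 'demo.txt' USING PigStorage(',') AS (name:chararray, score:int);
groupScore = GROUP data BY name;

-- Print the schema of the grouped relation: each record has a `group` field
-- (the grouping key) and a bag named `data` holding the original rows.
DESCRIBE groupScore;

-- data.name projects the name column of the whole bag, so it has one entry
-- per original row. `group` is the single key value for that bag.
-- The aliases `name` and `total` are illustrative.
result = FOREACH groupScore GENERATE group AS name, SUM(data.score) AS total;
DUMP result;
```

Running DESCRIBE on the grouped relation is a quick way to check which fields are scalars and which are bags before writing the FOREACH.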
Upvotes: 3
Reputation: 11080
Group by name
data = LOAD 'demo.txt' USING PigStorage(',') AS (name:chararray, score:int);
groupScore = GROUP data by name;
totalscore = FOREACH groupScore GENERATE group, SUM(data.score);
Upvotes: 0