Reputation: 18594
Consider a data file:
4, 8, 2
5, 2, 5
3, 1, 7
I want to calculate the average of every column. What's the easiest way to do this?
What if I have 20 columns? Is there a loop, so that I don't have to write out the calculation for every column manually?
Upvotes: 1
Views: 585
Reputation: 4724
Can you try this? Note that it averages the values in each row, not in each column:
A = LOAD 'input.txt' USING PigStorage(',') AS(a,b,c);
B = FOREACH A GENERATE AVG(TOBAG(*));
DUMP B;
Output:
(4.666666666666667)
(4.0)
(3.6666666666666665)
Update: to get the average of each column, group all rows and apply AVG to each column:
A = LOAD 'input.txt' USING PigStorage(',') AS(a,b,c);
B = GROUP A ALL;
C = FOREACH B GENERATE AVG(A.a),AVG(A.b),AVG(A.c);
DUMP C;
Output:
(4.0,3.6666666666666665,4.666666666666667)
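For the 20-column case: as far as I know, Pig Latin itself has no loop construct, so one option is to generate the script text from a host language. Below is a minimal Python sketch of that idea. The function name build_avg_script and the column names c1..cN are hypothetical (they are not from the question), and the input file is assumed to be the same input.txt.

```python
def build_avg_script(path, n_cols, delimiter=","):
    """Return Pig Latin text that averages each of n_cols columns."""
    # Hypothetical column names c1..cN, left untyped as in the script above.
    names = [f"c{i}" for i in range(1, n_cols + 1)]
    schema = ", ".join(names)
    # One AVG(...) expression per column, joined into a single GENERATE clause.
    averages = ", ".join(f"AVG(A.{name})" for name in names)
    return "\n".join([
        f"A = LOAD '{path}' USING PigStorage('{delimiter}') AS ({schema});",
        "B = GROUP A ALL;",
        f"C = FOREACH B GENERATE {averages};",
        "DUMP C;",
    ])


if __name__ == "__main__":
    # Print a script for a 20-column file.
    print(build_avg_script("input.txt", 20))
```

Save the printed text as a .pig file and run it with Pig. The generated statements follow the same GROUP ALL pattern as the three-column script, just with one AVG per generated column name.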
Upvotes: 2