Evgenij Reznik

Reputation: 18594

Calculating the average in Pig Latin

Consider a data file:

4, 8, 2
5, 2, 5
3, 1, 7

I want to calculate the average of every column. What's the easiest way to do this?
And if I have 20 columns, is there a loop construct, so that I don't have to write out the calculation for every column manually?

Upvotes: 1

Views: 585

Answers (1)

Sivasakthi Jayaraman

Reputation: 4724

Can you try this? Note that it averages the values within each row, not within each column:

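-- Load the comma-separated file into three fields
A = LOAD 'input.txt' USING PigStorage(',') AS (a, b, c);
-- Put all fields of a row into one bag and average that bag
B = FOREACH A GENERATE AVG(TOBAG(*));
DUMP B;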

Output:

(4.666666666666667)
(4.0)
(3.6666666666666665)

Update: Average of each column

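-- Load the comma-separated file into three fields
A = LOAD 'input.txt' USING PigStorage(',') AS (a, b, c);
-- Collect every row into a single group so AVG sees the whole column
B = GROUP A ALL;
C = FOREACH B GENERATE AVG(A.a), AVG(A.b), AVG(A.c);
DUMP C;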

Output:

(4.0,3.6666666666666665,4.666666666666667)
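For the 20-column case: as far as I know, Pig Latin itself has no loop construct, so one option is to generate the script text with a small helper and then run the result with Pig. Below is a minimal sketch in Python, not a definitive solution. The function name `build_avg_script` and the field names `c1` to `cN` are my own assumptions for illustration; the Pig statements it emits follow the same GROUP ALL pattern as above.

```python
def build_avg_script(num_columns, input_path="input.txt", delimiter=","):
    """Return Pig Latin text that averages each of num_columns columns."""
    # Hypothetical field names c1..cN (the example above used a, b, c).
    names = [f"c{i}" for i in range(1, num_columns + 1)]
    schema = ", ".join(names)
    # One AVG(...) expression per column, same pattern as AVG(A.a) above.
    averages = ", ".join(f"AVG(A.{name})" for name in names)
    return "\n".join([
        f"A = LOAD '{input_path}' USING PigStorage('{delimiter}') AS ({schema});",
        "B = GROUP A ALL;",
        f"C = FOREACH B GENERATE {averages};",
        "DUMP C;",
    ])


if __name__ == "__main__":
    # Print a script for 20 columns; save it to a file and run it with Pig.
    print(build_avg_script(20))
```

Calling `build_avg_script(3)` yields the same four statements as the update above, only with `c1, c2, c3` in place of `a, b, c`.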

Upvotes: 2
