Laure D

Reputation: 887

sum in apache pig: error 1045

I want to compute the ratio of two sums using Pig:

A = LOAD 's3://input' AS (field1:chararray, field2:int, field3:float, field4:float);
filtered_1 = FILTER A BY field3 >= 10;
filtered_2 = FILTER filtered_1  BY field4 >= 50;
grouped = GROUP filtered_2 BY field1;
B = FOREACH grouped GENERATE group as field1, SUM(A.field3)/SUM(A.field4) AS A_avg;

However, I get this error when running the last command:

ERROR grunt.Grunt: ERROR 1045: <line 5, column 55> Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast.

I cannot see why: I apply GROUP before computing the sums, and I have read the SUM documentation without spotting how my code differs from it.

Upvotes: 0

Views: 144

Answers (1)

KrazyGautam

Reputation: 2682

grouped = GROUP filtered_2 BY field1;

After this statement, grouped has no access to the alias A.

B = FOREACH grouped GENERATE group as field1, SUM(A.field3)/SUM(A.field4) AS A_avg;

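Inside "FOREACH grouped", the alias A is not visible. Each group holds a bag named after the relation that was grouped (here filtered_2), so field3 and field4 must be referenced through that bag, as filtered_2.field3 and filtered_2.field4. Because A.field3 cannot be resolved, Pig cannot pick a matching SUM overload, which is what ERROR 1045 reports.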
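To see which names are available after a GROUP, it can help to print the schema. The sketch below reuses the asker's relations and placeholder path 's3://input' (assuming the first column is meant to be field1); DESCRIBE should show grouped as having the key group plus a single bag named filtered_2 rather than A.

```pig
-- Load and filter as in the question ('s3://input' is the asker's placeholder path).
A = LOAD 's3://input' AS (field1:chararray, field2:int, field3:float, field4:float);
filtered_1 = FILTER A BY field3 >= 10;
filtered_2 = FILTER filtered_1 BY field4 >= 50;
grouped = GROUP filtered_2 BY field1;

-- Print the schema of grouped: the key `group` plus one bag named filtered_2,
-- which holds the original tuples for each key.
DESCRIBE grouped;

-- Reference the fields through that bag instead of through A.
B = FOREACH grouped GENERATE group AS field1,
    SUM(filtered_2.field3) / SUM(filtered_2.field4) AS A_avg;
```

Running DESCRIBE on an intermediate relation is a quick way to check bag names before writing the FOREACH.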

filtered_1 = FILTER A BY field3 >= 10;
filtered_2 = FILTER filtered_1  BY field4 >= 50;


These two filters amount to a single AND condition, so they can be combined into one statement:

filtered_3 = FILTER A BY field3 >= 10 AND field4 >= 50;

Now group the combined result and reference the fields through the bag named filtered_3:
grouped = GROUP filtered_3 BY field1;
B = FOREACH grouped GENERATE group as field1, SUM(filtered_3.field3)/SUM(filtered_3.field4) AS A_avg;

Upvotes: 1
