Lin Ma

Reputation: 10139

Referring to elements in a bag in Pig on Hadoop

I have an alias called student whose schema, as reported by the describe command, is:

studentIDInt:int,courses:bag{(courseId:int,testID:int,score:int)}

I am trying to filter students by score, but I get the Pig parse error shown below. If anyone has any ideas, that would be great. Thanks.

I am confused about the additional tuple reported in the error message.

student = filter student by courses.score > 3;

incompatible types in GreaterThan Operator left hand side:bag :tuple(score:int)  right hand score:int

regards, Lin

Upvotes: 0

Views: 846

Answers (1)

Alexey

Reputation: 2478

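You can't do it directly: courses.score is a bag of single-field tuples (one tuple per course), not a single int, which is why the error message reports a bag of tuple(score:int) on the left-hand side. One possible solution is to flatten the bag first, filter, and then group again: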

flat_student = foreach student generate studentIDInt, flatten(courses);
filtered_student = filter flat_student by score > 3;
final_student = group filtered_student by studentIDInt;   
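
Note that after the group step the relation has the fields group and filtered_student rather than the original studentIDInt and courses, and students with no score above 3 no longer appear at all. If the original shape is needed, a minimal sketch of the extra projection might look like this (reshaped_student is a made-up name, and the short field names assume they are unambiguous after the flatten):

```pig
-- Sketch only: reshaped_student is a hypothetical alias.
-- Rename the group key and project the course fields back into a bag named courses.
reshaped_student = foreach final_student generate
    group as studentIDInt,
    filtered_student.(courseId, testID, score) as courses;
```

Running describe on reshaped_student should confirm whether the schema matches the original one.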

Another way is to write a custom FilterFunc UDF; which approach to choose is up to you.
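
Besides the flatten/group approach and a custom FilterFunc, Pig's nested foreach can also filter inside the bag without flattening it. The following is only a sketch under the schema from the question; student_high, high_courses and student_filtered are made-up names, and whether students with no matching course should be dropped is an assumption about what is wanted:

```pig
-- Sketch only: aliases other than student are hypothetical.
-- Keep, per student, only the course tuples whose score exceeds 3.
student_high = foreach student {
    high_courses = filter courses by score > 3;
    generate studentIDInt, high_courses as courses;
};

-- Drop students whose bag is now empty, i.e. none of their scores exceeded 3.
student_filtered = filter student_high by not IsEmpty(courses);
```

This keeps the original studentIDInt/courses layout, so no regrouping is needed afterwards.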

Upvotes: 1
