Reputation: 167
I have the following data set for a movie database:
Ratings: UserID, MovieID, Rating
Movies: MovieID, Genre
I filtered the movies to keep only those whose genre is "Action" or "War" using:
movie_filter = filter Movies by (genre matches '.*Action.*') OR (genre matches '.*War.*');
Now I have to calculate the average rating for each War or Action movie, but the ratings are stored in the Ratings file. To do this, I use the following query:
movie_groups = GROUP movie_filter BY MovieID;
result = FOREACH movie_groups GENERATE Ratings.MovieID, AVG(Ratings.rating);
Then I store the result in a directory location. But when I run the program, I get the following error:
Could not infer the matching function for org.apache.pig.builtin.AVG as multiple or none of them fit. Please use an explicit cast.
Can anyone tell me what I'm doing wrong? Thanks in advance.
Upvotes: 1
Views: 1225
Reputation: 2333
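It looks like you're missing a JOIN statement, which would join your two data sets (ratings and movies) on the MovieID column. Grouping movie_filter on its own yields only the movie fields, so the ratings are not in the grouped relation and AVG has no bag of numeric ratings to work on. I've mocked up some test data (using the movie title as the ID) and provided some example code below.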
movie_avg.pig
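-- Load the ratings; declaring rating as int gives AVG a numeric type to work with.
ratings = LOAD 'movie_ratings.txt' USING PigStorage(',') AS (user_id:chararray, movie_id:chararray, rating:int);
-- Load the movies (the mock data uses the title as movie_id).
movies = LOAD 'movie_data.txt' USING PigStorage(',') AS (movie_id:chararray, genre:chararray);
-- Keep only Action or War movies.
movies_filter = FILTER movies BY (genre MATCHES '.*Action.*' OR genre MATCHES '.*War.*');
-- Attach every rating to its movie.
movies_join = JOIN movies_filter BY movie_id, ratings BY movie_id;
-- Keep only the columns needed for the average.
movies_cleanup = FOREACH movies_join GENERATE movies_filter::movie_id AS movie_id, ratings::rating AS rating;
-- One group per movie, then average the ratings inside each group.
movies_group = GROUP movies_cleanup BY movie_id;
data = FOREACH movies_group GENERATE group, AVG(movies_cleanup.rating);
DUMP data;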
Output of movie_avg.pig
(Jarhead,3.0)
(Platoon,4.333333333333333)
(Die Hard,3.0)
(Apocolypse Now,4.5)
(Last Action Hero,2.0)
(Lethal Weapon,4.0)
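To see why the grouping in the question cannot reach the ratings, it may help to inspect the schema of a relation grouped from the movies alone. The following is a minimal sketch, assuming the same mock file and column names as above; the alias movies_only_group is hypothetical and used only for illustration.

```pig
-- Sketch only: reproduce the question's grouping (movies without a join)
-- and print the schema of the grouped relation.
movies = LOAD 'movie_data.txt' USING PigStorage(',') AS (movie_id:chararray, genre:chararray);
movies_filter = FILTER movies BY (genre MATCHES '.*Action.*' OR genre MATCHES '.*War.*');

-- Hypothetical alias: grouping the filtered movies alone, as in the question.
movies_only_group = GROUP movies_filter BY movie_id;

-- DESCRIBE prints the schema instead of running the full job.
DESCRIBE movies_only_group;
```

The schema printed should contain only the group key and a movies_filter bag holding movie_id and genre. There is no rating field in it, which is why the join has to happen before the GROUP.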
movie_data.txt
Scrooged,Comedy
Apocolypse Now,War
Platoon,War
Guess Whos Coming To Dinner,Drama
Jarhead,War
Last Action Hero,Action
Die Hard,Action
Lethal Weapon,Action
My Fair Lady,Musical
Frozen,Animation
movie_ratings.txt
12345,Scrooged,4
12345,Frozen,4
12345,My Fair Lady,5
12345,Guess Whos Coming To Dinner,5
12345,Platoon,3
12345,Jarhead,2
23456,Platoon,5
23456,Apocolypse Now,4
23456,Die Hard,3
23456,Last Action Hero,2
34567,Lethal Weapon,4
34567,Jarhead,4
34567,Apocolypse Now,5
34567,Platoon,5
34567,Frozen,5
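The error message also suggests an explicit cast. That can matter when the rating column is not loaded as a numeric type. The following is a minimal sketch, assuming the rating was declared as chararray at load time; the aliases ratings_raw, ratings_typed, ratings_group and ratings_avg are hypothetical, and the filter and join steps are left out to keep the cast in focus.

```pig
-- Assumption: rating arrives as chararray, which AVG cannot average directly.
ratings_raw = LOAD 'movie_ratings.txt' USING PigStorage(',') AS (user_id:chararray, movie_id:chararray, rating:chararray);

-- Explicitly cast rating to double so AVG receives a bag of numeric values.
ratings_typed = FOREACH ratings_raw GENERATE movie_id, (double)rating AS rating;

-- Average per movie (all movies here; the genre filter and join are omitted).
ratings_group = GROUP ratings_typed BY movie_id;
ratings_avg = FOREACH ratings_group GENERATE group AS movie_id, AVG(ratings_typed.rating) AS avg_rating;
DUMP ratings_avg;
```

Declaring the type in the LOAD schema, as movie_avg.pig does with rating:int, avoids the extra cast step when the input format is known in advance.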
Upvotes: 3