Reputation: 397
I have a file containing data in the format: director movie
I'm using Hadoop and Java to process it.
Counting the number of movies for every director is pretty basic, but how can I modify the code to produce output like this instead:
director movie1 movie2 movie3...
Upvotes: 0
Views: 774
Reputation: 179
This will do.
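// Goes inside a class extending Reducer<Text, Text, Text, Text>;
// needs java.io.IOException and org.apache.hadoop.io.Text.
@Override
public void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
    // The accumulator must be initialized before the loop; an
    // uninitialized local String would not compile.
    StringBuilder movies = new StringBuilder();
    for (Text value : values) {
        if (movies.length() > 0) {
            movies.append(' '); // separator between titles, none at the end
        }
        movies.append(value.toString());
    }
    // Emits one line per director: director movie1 movie2 ...
    context.write(key, new Text(movies.toString()));
}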
Upvotes: 1
Reputation: 756
I think it follows straightforwardly from counting the number of movies for every director. The high-level structure may look like this:
mapper(file):
    for each (director, movie) in file:
        emit(director, movie)

reducer(director, movies):
    movielist = []
    for each movie in movies:
        movielist.add(movie)
    emit(director, movielist)
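To try the idea without a cluster, the pseudocode above can be mirrored in plain Java. This is only a local sketch, not Hadoop code: the class name DirectorMovies, the helper names, and the sample lines are made up for illustration, and it assumes the first whitespace-separated token on each line is the director and the rest of the line is the movie title.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DirectorMovies {

    // "Map" step plus shuffle: split each line into (director, movie) and
    // collect the movies under their director. A TreeMap keeps the keys
    // sorted, similar to how a reducer receives its keys in sorted order.
    static Map<String, List<String>> groupByDirector(List<String> lines) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            // Assumption: first token is the director, the rest is the title.
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length < 2) {
                continue; // skip blank or malformed lines
            }
            grouped.computeIfAbsent(parts[0], k -> new ArrayList<>()).add(parts[1]);
        }
        return grouped;
    }

    // "Reduce" step: join one director's movies into a single output line.
    static String formatLine(String director, List<String> movies) {
        return director + " " + String.join(" ", movies);
    }

    public static void main(String[] args) {
        // Hypothetical sample input in the "director movie" format.
        List<String> lines = Arrays.asList(
                "directorA movie1", "directorB movie2", "directorA movie3");
        for (Map.Entry<String, List<String>> e : groupByDirector(lines).entrySet()) {
            System.out.println(formatLine(e.getKey(), e.getValue()));
        }
        // prints:
        // directorA movie1 movie3
        // directorB movie2
    }
}
```

In a real job the grouping is done by the framework between the map and reduce phases, so only the splitting and the joining would need to be written by hand.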
Upvotes: 3