van

Reputation: 397

How to get Hadoop output to Text, Text format?

I have a file containing data in the format: director movie

I'm using Hadoop and Java to process it.

It's pretty basic to count the number of movies for every director, but how can I modify the code to get something like this:

director movie1 movie2 movie3...

Upvotes: 0

Views: 774

Answers (2)

Yashodhan K

Reputation: 179

This will do.

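    // Goes inside a class that extends Reducer<Text, Text, Text, Text>.
    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {

        // Start from an empty builder; an uninitialized String would not compile.
        StringBuilder movies = new StringBuilder();

        for (Text value : values) {
            // Separate titles with a single space, without a trailing one.
            if (movies.length() > 0) {
                movies.append(' ');
            }
            movies.append(value.toString());
        }
        // Emit one line per director: director, then all of their movies.
        context.write(key, new Text(movies.toString()));
    }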

Upvotes: 1

justmscs

Reputation: 756

I think it's a straightforward extension of counting the number of movies for every director. The high-level structure might look like this:

mapper(file):
    for each (director, movie) in file:
        emit(director, movie)

reducer(director, movies):
    movielist = []
    for each movie in movies:
        movielist.add(movie)
    emit(director, movielist)
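
To make the pseudocode above concrete, here is a rough sketch in plain Java that simulates the same map, group, and reduce steps locally, without any Hadoop classes. The class name DirectorMovies and its methods are made up for illustration, and it assumes each input line is "director<TAB>movie"; if your file uses a different separator, the split would need to change. In a real job, Hadoop does the grouping (shuffle) for you, so only the map and reduce logic would carry over.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local simulation; not part of the Hadoop API.
public class DirectorMovies {

    // "Map" step: split one input line into (director, movie).
    // Assumption: the two fields are separated by a single tab.
    static String[] map(String line) {
        return line.split("\t", 2);
    }

    // "Reduce" step: join all movies of one director into one string.
    static String reduce(List<String> movies) {
        return String.join(" ", movies);
    }

    // Stand-in for the shuffle Hadoop performs between map and reduce:
    // group the mapped values by key, then reduce each group.
    static Map<String, String> run(List<String> lines) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] pair = map(line);
            if (pair.length < 2) {
                continue; // skip lines that have no movie field
            }
            grouped.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }

        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }
}
```

Calling DirectorMovies.run on a list of lines returns a sorted map from each director to a space-separated string of their movies, which mirrors the (Text, Text) pairs the reducer would write.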

Upvotes: 3
