nano7

Reputation: 2493

Hadoop Map Reduce: MapOutputValueClass : Map<String, String>?

I have a Java MapReduce program. My map method outputs several strings and numbers, which I currently concatenate into a single String. In the reducer I split that String again and work with the individual parameters. Now I wonder whether this could be done more easily.

I was thinking of a Map in which I store my strings/numbers as values, each under a named key that describes it. This Map would then be my "value out" (MapOutputValueClass).

Is this possible? From what I read in the documentation, I guess my idea cannot be implemented:

The key and value classes have to be serializable by the framework and hence need to implement the Writable interface. Additionally, the key classes have to implement the WritableComparable interface to facilitate sorting by the framework.

So what would you advise me to choose for my MapOutputValueClass? :-) Maybe take a Map and convert it to ImmutableBytesWritable? I also don't want to slow down my program...

Thanks for answers!

Upvotes: 0

Views: 1735

Answers (1)

Chun

Reputation: 279

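You can write your own class that holds the various strings/numbers and use it as the output value class of the mapper and the input value class of the reducer. As the documentation you quoted says, the class has to implement Writable so the framework can serialize it. For example:

public class Foo implements Writable {   // org.apache.hadoop.io.Writable
     String A;
     String B;
     int c, d;

     // write(DataOutput) and readFields(DataInput) go here
      ....
}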
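To make the serialization part concrete, here is a minimal sketch of what the two Writable methods could look like for Foo. It is only an illustration under a few assumptions: the small Writable interface declared at the top is a stand-in with the same two methods as org.apache.hadoop.io.Writable, included so the snippet compiles without Hadoop on the classpath (in a real job, delete it and import Hadoop's interface instead), and the roundTrip helper is hypothetical, there only to show that a value survives being written and read back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Stand-in for org.apache.hadoop.io.Writable (assumption: used here only so
// the sketch compiles on its own; use Hadoop's interface in a real job).
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// In a real job this class would be public and live in its own file.
class Foo implements Writable {
    String A = "";
    String B = "";
    int c, d;

    // Hadoop instantiates value classes reflectively, so keep a no-arg constructor.
    Foo() {}

    Foo(String A, String B, int c, int d) {
        this.A = A;
        this.B = B;
        this.c = c;
        this.d = d;
    }

    // Write the fields in a fixed order.
    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(A);
        out.writeUTF(B);
        out.writeInt(c);
        out.writeInt(d);
    }

    // Read the fields back in exactly the same order they were written.
    @Override
    public void readFields(DataInput in) throws IOException {
        A = in.readUTF();
        B = in.readUTF();
        c = in.readInt();
        d = in.readInt();
    }

    // Hypothetical helper: serialize to bytes and deserialize into a new Foo.
    static Foo roundTrip(Foo original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            Foo copy = new Foo();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The important point is that write and readFields handle the fields in the same order. Compared with packing everything into one String and splitting it in the reducer, this keeps the numbers as binary ints and gives every value a named field.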

in your mapper:

public class MyMapper extends Mapper<Text, Text, Text, Foo>{
      ....
}

in your reducer:

public class MyReducer extends Reducer<Text, Foo, Text, LongWritable>{
       ...
}

in your driver:

set the mapper output value class:

job.setMapOutputValueClass(Foo.class);

Remember that when you extend Mapper, the type parameters have to be filled in in this order: <KEYIN_CLASS, VALUEIN_CLASS, KEYOUT_CLASS, VALUEOUT_CLASS>. The same applies to Reducer.

Upvotes: 1
