Jurgen

Reputation: 451

Java Generics & Hadoop: how to get a class variable

I'm a .NET programmer doing some Hadoop work in Java and I'm kind of lost here. In Hadoop I am trying to set up a Map-Reduce job where the output key of the Map job is of the type Tuple<IntWritable,Text>. When I set the output key using setOutputKeyClass as follows

JobConf conf2 = new JobConf(OutputCounter.class);
conf2.setOutputKeyClass(Tuple<IntWritable,Text>.class);

I get a whole bunch of compile errors because generics and the ".class" notation don't seem to fly together. The following works fine though

JobConf conf2 = new JobConf(OutputCounter.class);
conf2.setOutputKeyClass(IntWritable.class);

Anyone have any pointers on how to set the output key class?

Cheers, Jurgen

Upvotes: 1

Views: 952

Answers (1)

Yishai

Reputation: 91921

In Java, generic type arguments are erased at compile time, so there is no separate class object for Tuple<IntWritable,Text>. The best you can do is:

 conf2.setOutputKeyClass(Tuple.class);
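To see why only Tuple.class is available, here is a minimal sketch of erasure in plain Java. The Tuple class below is a hypothetical stand-in for the one in the question, and Integer/String/Double stand in for the Hadoop types so the example runs without Hadoop on the classpath.

```java
public class ErasureDemo {
    // Hypothetical stand-in for the question's Tuple type.
    static class Tuple<A, B> {
        final A first;
        final B second;

        Tuple(A first, B second) {
            this.first = first;
            this.second = second;
        }
    }

    // Returns true when two differently parameterized tuples share one runtime class.
    static boolean sameRuntimeClass() {
        Tuple<Integer, String> a = new Tuple<>(1, "one");
        Tuple<String, Double> b = new Tuple<>("pi", 3.14);
        return a.getClass() == b.getClass() && a.getClass() == Tuple.class;
    }

    public static void main(String[] args) {
        // Tuple<Integer, String>.class does not compile; Tuple.class is the only class literal.
        System.out.println(sameRuntimeClass());
    }
}
```

Both instances report the same runtime class, which is why the compiler rejects a class literal that carries type arguments.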

If you are able to, you can improve on this by subclassing Tuple so that the type arguments are kept at runtime:

 public class IntWritableTextTuple extends Tuple<IntWritable, Text> {}

And then use that as your parameter to setOutputKeyClass.
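As a rough illustration of what "keeping a type at runtime" means, the sketch below reads the type arguments back from such a subclass through reflection. Tuple, IntStringTuple, and typeArguments are hypothetical names for this example, and Integer/String again stand in for IntWritable/Text.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class SubclassTypeDemo {
    // Hypothetical stand-in for the question's Tuple type.
    static class Tuple<A, B> { }

    // The concrete subclass fixes the type arguments in its class declaration.
    static class IntStringTuple extends Tuple<Integer, String> { }

    // Returns the type arguments of the generic superclass, or an empty array if there are none.
    static Type[] typeArguments(Class<?> cls) {
        Type superType = cls.getGenericSuperclass();
        if (superType instanceof ParameterizedType) {
            return ((ParameterizedType) superType).getActualTypeArguments();
        }
        return new Type[0];
    }

    public static void main(String[] args) {
        // The subclass exposes two type arguments; the raw Tuple class exposes none.
        for (Type t : typeArguments(IntStringTuple.class)) {
            System.out.println(t.getTypeName());
        }
        System.out.println(typeArguments(Tuple.class).length);
    }
}
```

On the Hadoop side, a key class would presumably also need to implement WritableComparable and provide a no-argument constructor so the framework can instantiate it; that is an assumption about Hadoop's requirements, not something this sketch checks.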

Note: I know nothing about Hadoop, so this may not make sense there, but in general this is what you do with Java generics.

Upvotes: 4
