Reputation: 1214
I'm pretty confused about using Avro with MapReduce and can't find good tutorials to follow.
It seems that classes like AvroJob and AvroMapper are geared toward jobs where both the input and the output are Avro data files. What about when your input is just plain text?
Specifically:
My mapper takes LongWritable keys and Text values as input. It emits Text keys and MyAvroRecord values.
My reducer takes Text keys and an Iterator of MyAvroRecords as input, and emits Text keys and MyAvroRecord values.
How do I get an OutputFormat that will write these Text keys and MyAvroRecord values to a file?
Cheers, Dave
Upvotes: 6
Views: 2231
Reputation: 1
Another approach: the mapper's output need not be AvroKey and AvroValue. It can use your ordinary output types, which then become the input to your reducer. The reducer can do the Avro conversion, with the job's output format set to Avro.
regards, sujoy
Upvotes: 0
Reputation: 1214
Ok, so I figured this out.
Rather than a mapper that outputs Text keys and MyAvroRecord values, I needed one that produces AvroKey keys and AvroValue values. That mapper can feed its results straight into an AvroReducer, and I could just use AvroJob.setOutputSchema() to handle the output (I didn't have to implement an OutputFormat at all).
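For anyone landing here later, a rough sketch of that wiring, assuming the older org.apache.avro.mapred API (the one AvroJob and AvroReducer come from) with avro-mapred and Hadoop on the classpath. The class names TextToAvroJob, LineMapper and SumReducer are made up for illustration, and the one-field "count" schema is only a stand-in for the real MyAvroRecord, so adjust both to your own record. I haven't checked it against every Avro version.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.mapred.AvroCollector;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapred.AvroReducer;
import org.apache.avro.mapred.AvroValue;
import org.apache.avro.mapred.Pair;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;

public class TextToAvroJob {
  // Hypothetical stand-in for the schema of the generated MyAvroRecord class.
  static final Schema VALUE_SCHEMA = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"MyAvroRecord\",\"fields\":"
      + "[{\"name\":\"count\",\"type\":\"long\"}]}");
  static final Schema KEY_SCHEMA = Schema.create(Schema.Type.STRING);
  // Map output and job output are both (string key, record value) pairs.
  static final Schema PAIR_SCHEMA = Pair.getPairSchema(KEY_SCHEMA, VALUE_SCHEMA);

  // A plain Hadoop mapper: text in, AvroKey/AvroValue out.
  public static class LineMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, AvroKey<CharSequence>, AvroValue<GenericRecord>> {
    @Override
    public void map(LongWritable offset, Text line,
        OutputCollector<AvroKey<CharSequence>, AvroValue<GenericRecord>> out,
        Reporter reporter) throws IOException {
      GenericRecord record = new GenericData.Record(VALUE_SCHEMA);
      record.put("count", 1L);
      out.collect(new AvroKey<CharSequence>(line.toString()),
          new AvroValue<GenericRecord>(record));
    }
  }

  // An AvroReducer: receives unwrapped keys and values, emits Pair records.
  public static class SumReducer
      extends AvroReducer<CharSequence, GenericRecord, Pair<CharSequence, GenericRecord>> {
    @Override
    public void reduce(CharSequence key, Iterable<GenericRecord> values,
        AvroCollector<Pair<CharSequence, GenericRecord>> collector,
        Reporter reporter) throws IOException {
      long total = 0;
      for (GenericRecord value : values) {
        total += (Long) value.get("count");
      }
      GenericRecord result = new GenericData.Record(VALUE_SCHEMA);
      result.put("count", total);
      collector.collect(
          new Pair<CharSequence, GenericRecord>(key, KEY_SCHEMA, result, VALUE_SCHEMA));
    }
  }

  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf(TextToAvroJob.class);
    conf.setJobName("text-to-avro");

    // Plain-text input, so no AvroJob.setInputSchema() and no AvroMapper.
    conf.setInputFormat(TextInputFormat.class);
    conf.setMapperClass(LineMapper.class);

    // Tell Avro how to serialize the shuffle and the final output.
    AvroJob.setMapOutputSchema(conf, PAIR_SCHEMA);
    AvroJob.setReducerClass(conf, SumReducer.class);
    AvroJob.setOutputSchema(conf, PAIR_SCHEMA);

    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);
  }
}
```

With this setup the output should be an Avro data file whose records are key/value pairs, so the "Text key" ends up as the string key field of each pair rather than as a separate Hadoop Text object.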
Upvotes: 6