Shankar

Reputation: 8967

How to pass 'Text' as the Mapper input key in a Hadoop job?

My file content will be something like this:

TestKey, TestValue
TestKey1, TestValue1

I would like the Mapper key to be TestKey and the Mapper value to be TestValue, and so on.

So I have tried to write a custom RecordReader to achieve this.

But it's throwing an error like "Cannot cast LongWritable to Text".

How do I pass Text as my Mapper input key?

Any help on this is highly appreciated.

Thanks, Shankar

Upvotes: 2

Views: 1574

Answers (1)

jason

Reputation: 241731

It looks like you need to change the input format to KeyValueTextInputFormat and set the mapreduce.input.keyvaluelinerecordreader.key.value.separator property to ", ".[1]

The default input format is TextInputFormat, which uses the byte offset into the file (as a LongWritable) as the key and the line itself as the value. That's why you're currently seeing the cast error.
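For reference, here's a minimal driver and mapper sketch against the new API (class names like KeyValueDriver and MyMapper are just illustrative, not from your code). One detail worth knowing: KeyValueLineRecordReader only looks at the first byte of the separator, so setting it to "," splits each line at the comma and leaves the leading space on the value.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class KeyValueDriver {

        // With KeyValueTextInputFormat the mapper receives Text/Text pairs,
        // so the input key type is Text rather than LongWritable.
        public static class MyMapper extends Mapper<Text, Text, Text, Text> {
            @Override
            protected void map(Text key, Text value, Context context)
                    throws IOException, InterruptedException {
                // For the line "TestKey, TestValue", key is "TestKey" and
                // value is " TestValue" (the record reader splits on the
                // first byte of the separator, i.e. the comma, so the
                // space after it stays in the value).
                context.write(key, value);
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Split each line on the comma instead of the default tab.
            conf.set("mapreduce.input.keyvaluelinerecordreader.key.value.separator", ",");

            Job job = Job.getInstance(conf, "key value example");
            job.setJarByClass(KeyValueDriver.class);
            job.setInputFormatClass(KeyValueTextInputFormat.class);
            job.setMapperClass(MyMapper.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }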

[1]: This assumes that you're on the new API; there is something similar for the old API.

Upvotes: 1
