smatthewenglish

Reputation: 2889

Why would you define return value as function parameters in Hadoop?

At the moment I'm going through the Hadoop documentation for the Mapper class.

In the signature (is that the right nomenclature?), we have to specify what we put into it, and also what comes out:

Mapper<KEYIN,VALUEIN,KEYOUT,VALUEOUT>

Does that mean we need to define and instantiate these data structures outside of where we call this from?

Upvotes: 0

Views: 55

Answers (1)

OneCricketeer

Reputation: 191728

Yes, you need to define the InputFormat and OutputFormat of the specific MapReduce job. KEYOUT and VALUEOUT aren't return values, though; they are the types of what the mapper writes to the Context output.

This is all configured via the Job class.

The "signature", as you call it, is no different from any other use of Java generics: the four names are type parameters, not objects you have to instantiate yourself.
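To make the generics point concrete, here is a minimal sketch in plain Java with no Hadoop dependency. The `Mapper` and `Context` classes below are simplified, hypothetical stand-ins that only mimic the shape of Hadoop's `Mapper<KEYIN,VALUEIN,KEYOUT,VALUEOUT>`; `TokenMapper` and `runOnLines` are made-up names, and `Long`/`String`/`Integer` stand in for Writable types. It shows that the type parameters only constrain what `map` receives and what it may write to the context, and that `map` itself returns nothing.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class MapperSketch {

    // Hypothetical stand-in for Hadoop's Context: it just collects written pairs.
    static class Context<KEYOUT, VALUEOUT> {
        final List<Map.Entry<KEYOUT, VALUEOUT>> written = new ArrayList<>();

        void write(KEYOUT key, VALUEOUT value) {
            written.add(new SimpleEntry<>(key, value));
        }
    }

    // Simplified analogue of Mapper<KEYIN,VALUEIN,KEYOUT,VALUEOUT>.
    // Note that map() is void: output goes through the context, not a return value.
    abstract static class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
        abstract void map(KEYIN key, VALUEIN value, Context<KEYOUT, VALUEOUT> context);
    }

    // Binding the type parameters: (offset, line) in, (token, 1) out.
    static class TokenMapper extends Mapper<Long, String, String, Integer> {
        @Override
        void map(Long offset, String line, Context<String, Integer> context) {
            for (String token : line.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    context.write(token, 1);
                }
            }
        }
    }

    // Plays the role of the framework: it creates the inputs and the context,
    // so the mapper author never instantiates them.
    static List<Map.Entry<String, Integer>> runOnLines(List<String> lines) {
        TokenMapper mapper = new TokenMapper();
        Context<String, Integer> context = new Context<>();
        long offset = 0;
        for (String line : lines) {
            mapper.map(offset, line, context);
            offset += line.length() + 1; // +1 for the newline, as an approximation
        }
        return context.written;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> pair : runOnLines(List.of("hello hadoop", "hello"))) {
            System.out.println(pair.getKey() + "\t" + pair.getValue());
        }
    }
}
```

In real Hadoop code the framework takes the place of `runOnLines`: it builds the input keys and values from the InputFormat and hands your `map` method a ready-made Context.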

With the default TextInputFormat and its LineRecordReader, KEYIN is LongWritable (the byte offset of the line in the file) and VALUEIN is Text (the contents of the line).

Other formats and Writables are already defined by the Hadoop libraries, if that's your question.

You're welcome to define your own, though.

Upvotes: 1
