javanx

Reputation: 698

Hadoop Map Output Type For Performance

I have about 10 fields to be written out by the Mapper. Which approach would be faster? 1. Write out the fields individually, as follows:

    tradeDate.readFields(in);
    marketMakerId.readFields(in);
    eventTime.readFields(in);
    bidPrice.readFields(in);
    ......................... 

or 2. Convert them to a single Text field (tradeDate,marketMakerId,eventTime,bidPrice....) and reconstruct the object in the Reducer.

Which of these would give better performance?

Upvotes: 0

Views: 142

Answers (1)

octo

Reputation: 665

As usual, a benchmark would help. You can use Caliper to check your hypothesis.

In general, though, binary formats are faster because they avoid text<->binary conversions. So I expect the binary readFields/write approach to be faster.
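To make the comparison concrete, here is a minimal sketch of both options using only `java.io`, so it runs without Hadoop on the classpath. The `write(DataOutput)` and `readFields(DataInput)` methods have the same signatures as Hadoop's `Writable` interface. The class name `TradeRecord`, the field types, and the helper methods are my assumptions for illustration, and only four of the fields are shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical record: field names follow the question, the types are assumed.
class TradeRecord {
    int tradeDate;      // assumed to be yyyyMMdd packed into an int
    int marketMakerId;
    long eventTime;     // assumed to be epoch milliseconds
    double bidPrice;

    // Option 1: fixed-width binary fields, no parsing or formatting.
    // Same signature as Hadoop's Writable.write(DataOutput).
    void write(DataOutput out) throws IOException {
        out.writeInt(tradeDate);
        out.writeInt(marketMakerId);
        out.writeLong(eventTime);
        out.writeDouble(bidPrice);
    }

    // Must read the fields in exactly the order write() emitted them.
    void readFields(DataInput in) throws IOException {
        tradeDate = in.readInt();
        marketMakerId = in.readInt();
        eventTime = in.readLong();
        bidPrice = in.readDouble();
    }

    // Helper (not part of Writable): serialize to a byte array.
    byte[] toBinary() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        write(new DataOutputStream(buf));
        return buf.toByteArray();
    }

    // Helper (not part of Writable): rebuild a record from bytes.
    static TradeRecord fromBinary(byte[] bytes) throws IOException {
        TradeRecord r = new TradeRecord();
        r.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        return r;
    }

    // Option 2: one delimited string, formatted on the map side...
    String toText() {
        return tradeDate + "," + marketMakerId + "," + eventTime + "," + bidPrice;
    }

    // ...and split and parsed again on the reduce side.
    static TradeRecord fromText(String line) {
        String[] parts = line.split(",");
        TradeRecord r = new TradeRecord();
        r.tradeDate = Integer.parseInt(parts[0]);
        r.marketMakerId = Integer.parseInt(parts[1]);
        r.eventTime = Long.parseLong(parts[2]);
        r.bidPrice = Double.parseDouble(parts[3]);
        return r;
    }
}
```

With these assumed types the binary form is always 24 bytes per record, while the text form varies with the values and pays for number formatting in the Mapper plus `split` and `parse*` calls in the Reducer. Timing `toBinary`/`fromBinary` against `toText`/`fromText` on your real data is the kind of check a benchmark would give you.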

Upvotes: 1
