krish

Reputation: 90

How to process individual files in Hadoop using MR code

I have a file with fields meterid, hour, watts, and some other fields. I made a composite key of meterid and hour, and I sum the watts for each hour for each meterid. The MR code works fine for a single file.

I have multiple files, and each file name is a date, like

14-05-2015.txt, 15-05-2015.txt etc.

When I execute the same code, it adds up all the watts for each meterid and hour across every file. But I want the watts summed per file for each meterid and hour, not across all files.
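To make the intended aggregation concrete, here is a plain-Java sketch (not actual Hadoop code) of what the single-file job computes: group records on the composite key (meterid, hour) and sum the watts. The comma separator and the field positions (meterid, hour, watts as the first three fields) are assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MeterHourSum {
    // Simulates the map + reduce result for one file:
    // key = meterid_hour, value = sum of watts for that key.
    public static Map<String, Double> sum(String[] lines) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (String line : lines) {
            String[] f = line.split(",");
            String key = f[0] + "_" + f[1];          // composite key: meterid_hour
            double watts = Double.parseDouble(f[2]); // watts field (assumed position)
            totals.merge(key, watts, Double::sum);   // reduce-side sum
        }
        return totals;
    }

    public static void main(String[] args) {
        String[] lines = {"m1,09,10.5", "m1,09,4.5", "m1,10,2.0"};
        System.out.println(MeterHourSum.sum(lines)); // {m1_09=15.0, m1_10=2.0}
    }
}
```

When the same logic runs over several input files at once, Hadoop merges identical (meterid, hour) keys from all files into one reduce group, which is exactly the behavior described above.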

Upvotes: 0

Views: 49

Answers (2)

Soma Sekhar Kuruva

Reputation: 35

If your fields are in a structured format, read the entire line into a string array using the field separator. From that array, use the meterID as the key and value = hours * number of watts (s[2] * s[3]); it will work for all lines.
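If I read this suggestion right, the map-side logic would look roughly like the sketch below. The separator and the field positions (meterID in s[0], hours in s[2], watts in s[3]) are assumptions taken from the answer, not confirmed by the question.

```java
public class MapParse {
    // Key suggested by the answer: meterID, assumed to be the first field.
    public static String key(String line, String sep) {
        return line.split(sep)[0];
    }

    // Value suggested by the answer: hours * watts, i.e. s[2] * s[3].
    public static double value(String line, String sep) {
        String[] s = line.split(sep);
        return Double.parseDouble(s[2]) * Double.parseDouble(s[3]);
    }

    public static void main(String[] args) {
        String line = "m1,2015-05-14,2,30";
        System.out.println(MapParse.key(line, ","));   // m1
        System.out.println(MapParse.value(line, ",")); // 60.0
    }
}
```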

Upvotes: 0

suresiva

Reputation: 3173

To solve this easily, you may include the file name in the composite key that you compose. This will let the keys generated for each file be grouped separately before the reduce phase.

First you have to find the file name in your Mapper class. You may use the snippet below in your mapper's setup() method:

String fileName = ((FileSplit) context.getInputSplit()).getPath().toString();

Add this file name to your composite key, along with the corresponding equality-check implementations, and the keys will be grouped with the file name taken into account, which solves your problem. Hope this helps.
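The extended key might look like the sketch below. It is shown as a plain Java class so it stands alone: in a real Hadoop job the class would implement WritableComparable and serialize its fields, but the equals/hashCode/compareTo contract shown here is what drives the per-file grouping. The field names are illustrative.

```java
import java.util.Objects;

public class MeterHourFileKey implements Comparable<MeterHourFileKey> {
    final String fileName; // new component: keeps each input file separate
    final String meterId;
    final int hour;

    public MeterHourFileKey(String fileName, String meterId, int hour) {
        this.fileName = fileName;
        this.meterId = meterId;
        this.hour = hour;
    }

    @Override
    public int compareTo(MeterHourFileKey o) {
        int c = fileName.compareTo(o.fileName); // compare by file first
        if (c != 0) return c;
        c = meterId.compareTo(o.meterId);       // then by meter id
        if (c != 0) return c;
        return Integer.compare(hour, o.hour);   // then by hour
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof MeterHourFileKey)) return false;
        MeterHourFileKey k = (MeterHourFileKey) o;
        return fileName.equals(k.fileName)
                && meterId.equals(k.meterId)
                && hour == k.hour;
    }

    @Override
    public int hashCode() {
        return Objects.hash(fileName, meterId, hour);
    }
}
```

Because the file name participates in comparison and equality, the same meterid and hour from 14-05-2015.txt and 15-05-2015.txt land in different reduce groups, so the watts are summed per file.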

Upvotes: 2
