Reputation: 1715
I am new to Hadoop, and I have a situation where only one line out of every four lines of the input text is relevant. Currently I am using the default TextInputFormat with conditional logic in the mapper to skip the other three lines, which are irrelevant.
How can I use a custom InputFormat to handle this? Since I am new to Hadoop, I don't know much about custom input formats. Any help would be appreciated. Thanks!
Upvotes: 0
Views: 714
Reputation: 6273
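I think you can use NLineInputFormat, where you can specify how many lines go into each input split. Note that each record is still a single line, so with N = 4 every map task receives one group of four lines and still has to pick out the relevant one. This could be an easy, ready-to-use solution.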
If you want to implement your own input format, you would probably implement a custom InputFormat together with a custom RecordReader that defines what constitutes one record.
Below is one example: http://deep-developers.blogspot.in/2014/06/custom-input-split-and-custom.html
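To make the record-reader idea more concrete, here is a minimal, Hadoop-free sketch of the skipping logic such a reader could use. It is not the code from the linked post and it does not use the Hadoop API; the class name `EveryNthLineReader` and the parameters `groupSize` and `keepIndex` are hypothetical names chosen for illustration. The method names only mirror the `nextKeyValue()` / `getCurrentKey()` / `getCurrentValue()` contract of Hadoop's `RecordReader`, and the key here is a line index rather than the byte offset that TextInputFormat uses. It also assumes the relevant line is the first of each group of four, which the question does not state.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

/**
 * Hadoop-free sketch: emits only one line out of every group of lines.
 * All names here are illustrative, not part of any Hadoop API.
 */
class EveryNthLineReader {
    private final BufferedReader in;
    private final int groupSize;   // lines per group, e.g. 4
    private final int keepIndex;   // 0-based position of the relevant line in a group (assumption)
    private long linesRead = 0;    // number of lines consumed so far
    private long currentKey = -1;  // line index of the current record
    private String currentValue;   // text of the current record

    EveryNthLineReader(BufferedReader in, int groupSize, int keepIndex) {
        if (groupSize <= 0 || keepIndex < 0 || keepIndex >= groupSize) {
            throw new IllegalArgumentException("need 0 <= keepIndex < groupSize");
        }
        this.in = in;
        this.groupSize = groupSize;
        this.keepIndex = keepIndex;
    }

    /** Advances to the next relevant line; returns false at end of input. */
    boolean nextKeyValue() throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            long index = linesRead++;
            if (index % groupSize == keepIndex) {
                currentKey = index;
                currentValue = line;
                return true;
            }
            // otherwise: irrelevant line, skip it and keep reading
        }
        currentValue = null;
        return false;
    }

    long getCurrentKey() { return currentKey; }

    String getCurrentValue() { return currentValue; }

    public static void main(String[] args) throws IOException {
        String input = "keep-1\nskip\nskip\nskip\nkeep-2\nskip\nskip\nskip\n";
        EveryNthLineReader reader =
            new EveryNthLineReader(new BufferedReader(new StringReader(input)), 4, 0);
        while (reader.nextKeyValue()) {
            System.out.println(reader.getCurrentKey() + "\t" + reader.getCurrentValue());
        }
    }
}
```

In a real job this logic would live inside a class extending Hadoop's `RecordReader`, returned from a custom InputFormat's `createRecordReader`. One thing to watch for: if the file is split at an arbitrary byte boundary, the groups of four can get out of step, so a simple option is probably to override `isSplitable` in the input format to return false, at the cost of parallelism within a single file.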
Upvotes: 1