Reputation: 33
This is the code where I read the file that contains HL7 messages and iterate through them using the HAPI iterator (from http://hl7api.sourceforge.net):
File file = new File("/home/training/Documents/msgs.txt");
InputStream is = new FileInputStream(file);
is = new BufferedInputStream(is);
Hl7InputStreamMessageStringIterator iter = new Hl7InputStreamMessageStringIterator(is);
I want to do this inside the map function. Obviously I need to prevent splitting in the InputFormat so that the entire file is read at once as a single value and converted to a String (the file size is 7 KB), because, as you know, HAPI can parse only entire messages.
I am a newbie to all of this, so please bear with me.
Upvotes: 1
Views: 1355
Reputation: 202
If you do not want your data file to be split, so that one file is processed by exactly one mapper, extend the MapReduce input format and override the isSplitable() method to return false.
For reference (not based on your code): https://gist.github.com/sritchie/808035
Upvotes: 1
Reputation: 1982
Since the input comes from a text file, you can override the isSplitable() method of FileInputFormat so that it returns false. One mapper will then process the whole file.
@Override
protected boolean isSplitable(JobContext context, Path filename)
{
return false; // never split: each input file goes to a single mapper
}
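Once the file is unsplittable and reaches the mapper as one value, the remaining step is turning that value back into individual messages. The sketch below uses only the JDK and leaves out the Hadoop wiring. The class and method names (WholeFileValueSketch, asStream, splitMessages) are hypothetical. splitMessages is a simplified stand-in for the HAPI iterator, assuming every message starts with an "MSH|" segment; in a real job you would pass the stream from asStream to Hl7InputStreamMessageStringIterator instead.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class WholeFileValueSketch {

    // Wrap the whole-file value in an InputStream, the form the
    // iterator in the question is constructed from.
    static InputStream asStream(String wholeFile) {
        return new ByteArrayInputStream(wholeFile.getBytes(StandardCharsets.UTF_8));
    }

    // Stand-in for the HAPI iterator (assumption: each message begins
    // with an "MSH|" segment). Segments are re-joined with '\r'.
    static List<String> splitMessages(String wholeFile) {
        List<String> messages = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String segment : wholeFile.split("\\r\\n|\\r|\\n")) {
            if (segment.isEmpty()) {
                continue; // skip blank lines between messages
            }
            if (segment.startsWith("MSH|") && current.length() > 0) {
                messages.add(current.toString()); // previous message is complete
                current.setLength(0);
            }
            current.append(segment).append('\r');
        }
        if (current.length() > 0) {
            messages.add(current.toString()); // flush the last message
        }
        return messages;
    }

    public static void main(String[] args) {
        String value = "MSH|^~\\&|A\rPID|1\rMSH|^~\\&|B\rPID|2\r";
        System.out.println(splitMessages(value).size() + " messages found");
    }
}
```

Inside a real map function the String would come from the record's value, which is why the whole file has to arrive as a single record: a split in the middle of a message would hand the parser an incomplete message.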
Upvotes: 0
Reputation: 4179
You will need to implement your own FileInputFormat subclass:
Override the isSplitable() method to return false, which means that the number of mappers will equal the number of input files: one input file per mapper.
Implement the getRecordReader() method. This is exactly the class where you need to put your parsing logic from above.
Upvotes: 1