Reputation: 13
We can provide input files to the mapper with:
FileInputFormat.setInputPaths(conf, inputPath);
Is it possible to pass a reference to in-memory data, say a DOM tree built by a DOM parser from an XML file, as input to the mapper function of the Hadoop framework?
What other possibilities are there?
Upvotes: 1
Views: 343
Reputation: 10652
No, you can't pass in-memory (RAM-based) data as input to a mapper.
The reason is that Hadoop applications are generally distributed across many physically separate machines, so a reference into one machine's memory means nothing on the others. The current version of Hadoop "only" supports distributed data using HDFS, which is a file system.
What you can do is add the DOM parsing as a preprocessing step in your mapper and simply specify your XML file as the input. The easiest way to do that is to create your own subclass of FileInputFormat.
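As a rough sketch of that preprocessing step, the parsing itself could look something like the helper below. It uses only the JDK's DOM parser and no Hadoop classes, so it runs on its own. The class name `XmlRecordParser` and the method `parse` are made up for illustration, and it assumes each record reaches the mapper as one complete XML document in a string, which is what a custom input format would have to arrange.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of Hadoop): turns the XML text of one
// input record into a DOM tree on the machine where the mapper runs.
final class XmlRecordParser {

    private XmlRecordParser() {
        // Static helper only, no instances.
    }

    // Assumes `xml` holds one complete, well-formed XML document.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Rethrow unchecked so the caller decides whether to skip
            // the record or fail the task.
            throw new IllegalArgumentException("Could not parse XML record", e);
        }
    }
}
```

Inside the mapper you would then call something like `Document doc = XmlRecordParser.parse(value.toString());` at the top of `map`, where `value` is the record handed over by your input format. That way the DOM tree is rebuilt locally on each node instead of being shared through memory.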
HTH
Upvotes: 1