lam

Reputation: 13

Input to the Mapper in Hadoop

We can provide input files to the mapper as

FileInputFormat.setInputPaths(conf, inputPath);

Is it possible to pass a reference to an in-memory object, say a DOM tree built by a DOM parser from an XML file, as the input to the mapper function in Hadoop?

What other possibilities are there?

Upvotes: 1

Views: 343

Answers (1)

Niels Basjes

Reputation: 10652

No, you can't specify memory (RAM) based information.

The reason is that, in general, Hadoop applications are distributed over many physically separate machines, so a reference into one machine's memory means nothing on another. The current version of Hadoop "only" supports distributed data using HDFS ... which is a file system.

What you can do is run the DOM parser as a preprocessing step inside your mapper and simply specify your XML file as the input. The easiest way to do that is to create your own subclass of FileInputFormat.
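As a rough sketch of the preprocessing step only: the class below builds a DOM tree from the text of one input record, using just the JDK's own XML parser. The class name DomPreprocessor and its parse method are hypothetical names made up for this illustration, and it assumes that each record handed to the mapper contains one complete XML document. The Hadoop wiring (the custom FileInputFormat and the Mapper that would call this on each record value) is deliberately left out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: not part of Hadoop. A mapper could call parse()
// on the text of each record it receives.
class DomPreprocessor {

    // Parses one complete XML document held in a string into a DOM tree.
    // Parser failures are rethrown unchecked so callers can decide
    // whether to skip the record or fail the task.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML record", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the contents of one input record.
        Document doc = parse("<root><item>a</item><item>b</item></root>");
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(doc.getElementsByTagName("item").getLength());
    }
}
```

The DOM tree is built on whichever node runs the map task, from data read out of the input file, so nothing memory-based has to cross machine boundaries.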

HTH

Upvotes: 1
