Drizzt321

Reputation: 1021

How to read a file from HDFS in a non-Java client

So my MR job generates a report file, and an end user needs to be able to download it by clicking a button on a normal web reporting interface. According to this O'Reilly book excerpt, there is an HTTP read-only interface. It says it's XML-based, but it seems to be simply the normal web interface intended to be viewed in a web browser, not something that can be programmatically queried, listed, and downloaded. Is my only recourse to write my own servlet-based interface? Or to execute the hadoop CLI tool?

Upvotes: 1

Views: 2808

Answers (2)

Niels Basjes

Reputation: 10642

The way to access HDFS programmatically from something other than Java is by using Thrift. There are pre-generated client classes for several languages (Java, Python, PHP, ...) included in the HDFS source tree.

See http://wiki.apache.org/hadoop/HDFS-APIs

Upvotes: 3

Eric Wendelin

Reputation: 44349

I'm afraid you will probably have to settle for the CLI, AFAIK.

Not sure if it would fit your situation, but I think it would be reasonable to have whatever script kicks off the MR job run hadoop dfs -get ... after job completion, copying the output to a known directory that's served.
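As a rough sketch of that idea, a wrapper script could shell out to the CLI once the job has finished. This assumes the hadoop binary is on the PATH; the function names, the HDFS path, and the served directory are all made up for illustration, and newer Hadoop releases may prefer hadoop fs -get over hadoop dfs -get.

```python
import os
import subprocess
from pathlib import PurePosixPath


def build_get_command(hdfs_path, local_dir):
    """Return the argv that copies one HDFS file into local_dir."""
    # HDFS paths always use forward slashes, so take the file name with PurePosixPath.
    file_name = PurePosixPath(hdfs_path).name
    target = os.path.join(local_dir, file_name)
    return ["hadoop", "dfs", "-get", hdfs_path, target]


def fetch_report(hdfs_path, local_dir, runner=subprocess.run):
    """Copy the report out of HDFS and return the local path it was written to.

    runner is injectable so the function can be exercised without a cluster.
    """
    cmd = build_get_command(hdfs_path, local_dir)
    # check=True makes subprocess.run raise CalledProcessError on a non-zero exit.
    runner(cmd, check=True)
    return cmd[-1]


if __name__ == "__main__":
    # Hypothetical paths: replace with the job's real output and the web-served directory.
    print(fetch_report("/reports/out/part-00000", "/var/www/reports"))
```

The web interface would then only need to link to the file in the served directory, so the download button never has to talk to HDFS directly.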

Sorry that I don't know of an easier solution.

Upvotes: -1
