Reputation: 1021
So my MR job generates a report file, and that file needs to be downloadable by an end user: they click a button on a normal web reporting interface and it downloads the output. According to this O'Reilly book excerpt, there is an HTTP read-only interface. It says it's XML-based, but it seems to be simply the normal web interface intended to be viewed through a web browser, not something that can be programmatically queried, listed, and downloaded from. Is my only recourse to write my own servlet-based interface? Or to execute the hadoop CLI tool?
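In case it helps frame the question, here's roughly what I mean by a servlet-based interface. This is only a rough sketch using the plain Hadoop Java FileSystem API; the servlet name, namenode address, and report path are made-up placeholders, not anything from my actual setup:

    import java.io.IOException;
    import java.io.InputStream;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    // Hypothetical servlet that streams an HDFS file back to the browser.
    public class ReportDownloadServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            Configuration conf = new Configuration();
            // Placeholder namenode address; normally picked up from core-site.xml.
            conf.set("fs.default.name", "hdfs://namenode:9000");

            FileSystem fs = FileSystem.get(conf);
            // Placeholder path to the MR job's output file.
            Path report = new Path("/reports/output/part-00000");

            resp.setContentType("application/octet-stream");
            resp.setHeader("Content-Disposition", "attachment; filename=report.txt");

            InputStream in = fs.open(report);
            try {
                // Copy the HDFS stream straight into the HTTP response.
                IOUtils.copyBytes(in, resp.getOutputStream(), conf, false);
            } finally {
                in.close();
            }
        }
    }

This works, but it means bundling the Hadoop client jars and config into the web app, which is what I was hoping to avoid.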
Upvotes: 1
Views: 2808
Reputation: 10642
The way to access HDFS programmatically from something other than Java is by using Thrift. There are pre-generated client classes for several languages (Java, Python, PHP, ...) included in the HDFS source tree.
See http://wiki.apache.org/hadoop/HDFS-APIs
Upvotes: 3
Reputation: 44349
I'm afraid you will probably have to settle for the CLI, AFAIK.
Not sure if it would fit your situation, but I think it would be reasonable to have whatever script kicks off the MR job do a hadoop dfs -get ... after job completion, into a known directory that's served.
Sorry that I don't know of an easier solution.
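For instance, the driver script could do something roughly like this (the jar name, output path, and web root are just placeholders for illustration):

    # Run the job, then pull the report out of HDFS into a directory
    # that the web server already serves. All paths here are made up.
    hadoop jar report-job.jar com.example.ReportJob /input /reports/output
    hadoop dfs -get /reports/output/part-00000 /var/www/reports/report.txt

Then the "download" button on your reporting interface just links to the file under the served directory, with no Hadoop code in the web tier at all.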
Upvotes: -1