realnumber

Reputation: 2284

Hadoop support for PHP, Ruby

I'm wondering how much Hadoop programming can be done using PHP or Ruby. I found articles talking about the Hadoop Streaming API, which can be used from PHP or Ruby.

My questions

  1. Can you write a MapReduce job in PHP or Ruby that can work with other Hadoop Java MapReduce jobs?

  2. In terms of API-level programming, what is missing for non-Java languages in Hadoop? That is, what can currently be done only in Java and not in other languages?

Thanks

Upvotes: 0

Views: 618

Answers (2)

David Gruzman

Reputation: 8088

In a nutshell, Hadoop has a number of other plugins besides mappers and reducers: combiners, input/output formats, and comparators. These plugins can be written in Java only.
This means that using Hadoop via streaming can suit some simple cases but will seriously reduce your flexibility.
Streaming is also somewhat slower, because a different mechanism is used to pass records to the mappers and reducers.
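
To make the streaming mechanism concrete: Hadoop Streaming hands records to an external program as lines of text on standard input and reads key/value lines back from its standard output, with a tab as the default separator. Below is a minimal word-count sketch in Ruby under that assumption. The names `map_lines` and `reduce_lines` and the file name `wordcount.rb` are made up for illustration and are not part of any Hadoop API.

```ruby
# Word-count sketch in the Hadoop Streaming style (illustrative only).
# Assumption: records arrive as text lines on STDIN and results are
# written as "key<TAB>value" lines on STDOUT.

# Mapper: emit "word\t1" for every whitespace-separated word.
def map_lines(input, output)
  input.each_line do |line|
    line.split.each { |word| output.puts("#{word.downcase}\t1") }
  end
end

# Reducer: input is expected to be sorted by key, so equal keys are
# adjacent. Sum the counts for each run of identical keys.
def reduce_lines(input, output)
  current = nil
  total = 0
  input.each_line do |line|
    key, value = line.chomp.split("\t", 2)
    if key != current
      # A new key starts: flush the finished one first.
      output.puts("#{current}\t#{total}") unless current.nil?
      current = key
      total = 0
    end
    total += value.to_i
  end
  # Flush the last key, if there was any input at all.
  output.puts("#{current}\t#{total}") unless current.nil?
end

# Pick the role from the first argument: `map` or `reduce`.
case ARGV[0]
when "map" then map_lines($stdin, $stdout)
when "reduce" then reduce_lines($stdin, $stdout)
end
```

The same script can be checked locally without a cluster, for example with `cat input.txt | ruby wordcount.rb map | sort | ruby wordcount.rb reduce`, where `sort` stands in for the shuffle phase. On a cluster it would be passed to the streaming jar through its `-mapper` and `-reducer` options; the jar's location depends on the installation.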

Upvotes: 2

sa125

Reputation: 28971

  1. If you're referring to chaining Java M/R jobs (e.g., the native API) with streaming jobs, I saw this seemingly relevant answer on a similar thread.
  2. What you can do in a streaming script is basically limited by the capabilities of the language you pick. Both Ruby and PHP are fairly powerful, so I'm not sure what you're missing in them.

Personally, I also come from a Ruby/Python background, and at first I tried using streaming to get things done. Eventually I decided to give the Java API a chance, and it turned out to be not too bad :)

Upvotes: 1
