Josh Smith

Reputation: 15042

Just how much Java does one need to use Hadoop and Mahout effectively?

I'm a PHP developer. Let's just get that out of the way now. But Hadoop – and Mahout in particular – have piqued my interest. I'm ready to take the dive into Java in order to use them.

So, from people experienced enough to know, just how much Java will I need to be able to use these effectively? From what I've seen, programming mappers/reducers doesn't take all that much. But with Mahout I'm not at all sure what I'm looking at when I read the documentation.

Also, just how hard will it be to take data from my PHP application for processing in Java via Hadoop and Mahout? I can't imagine it'd be that difficult, but I'm not experienced enough to say.

Upvotes: 5

Views: 2095

Answers (4)

nicolai.tesela

Reputation: 86

For real-time recommendations you could also instantiate Mahout in a Java servlet class, then export that as a WAR and serve it from a Tomcat server.

Upvotes: 0

Jilles

Reputation: 748

I just did the same thing, and it's been years since I did anything Java-related. What I did was the following:

  1. Started off with simple Hadoop streaming examples
  2. Tried my own analysis with PHP streaming
  3. Started experimenting with Pig
  4. Started experimenting with using PHP streaming inside Pig

All without any Java!
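
To show why streaming needs so little: by default, Hadoop streaming runs any executable that reads input lines from standard input and writes tab-separated key/value lines to standard output. Below is a minimal sketch of a word-count mapper following that contract. It is written in Java only for illustration; the class name `StreamingWordMapper` and the word-count task are assumptions, not something from the steps above, and the same few lines translate directly to a PHP script.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical word-count mapper following the Hadoop streaming contract:
// read records line by line from stdin, write "key<TAB>value" lines to stdout.
class StreamingWordMapper {

    // Turns one input line into zero or more "word<TAB>1" output records.
    static List<String> map(String line) {
        List<String> records = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                records.add(token.toLowerCase(Locale.ROOT) + "\t1");
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8));
        String line;
        while ((line = in.readLine()) != null) {
            for (String record : map(line)) {
                System.out.println(record);
            }
        }
    }
}
```

A reducer follows the same pattern: it receives the mapper output sorted by key on standard input and sums the values for each run of identical keys.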

Upvotes: 1

Harsha Hulageri

Reputation: 2830

A beginner's level of Java is sufficient. You can always dig deeper as the need arises.

Upvotes: 1

Ted Dunning

Reputation: 1907

It shouldn't be all that difficult to get data from PHP to Java for analysis using Mahout and Hadoop.

Even easier is to process using Mahout and Hadoop off-line in a batch mode and to store the data products in a file system or database. PHP can then read these data products as easy as falling off a log.
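
As a rough sketch of that batch hand-off, the Java side could write its results as a plain CSV file that PHP later reads (for example with `fgetcsv`). Everything specific here is an assumption made for illustration: the `RecommendationExport` class, the `user_id,item_id,score` layout, the file name, and the sample values. None of it is Mahout's own output format.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical exporter: writes batch results as CSV so that a PHP page
// can read them later without calling into Java at request time.
class RecommendationExport {

    // One precomputed recommendation; the field names are illustrative.
    static final class Rec {
        final long userId;
        final long itemId;
        final double score;

        Rec(long userId, long itemId, double score) {
            this.userId = userId;
            this.itemId = itemId;
            this.score = score;
        }
    }

    // Formats recommendations as "userId,itemId,score" lines under a header row.
    static List<String> toCsvLines(List<Rec> recs) {
        List<String> lines = new ArrayList<>();
        lines.add("user_id,item_id,score");
        for (Rec r : recs) {
            lines.add(String.format(Locale.ROOT, "%d,%d,%.4f",
                    r.userId, r.itemId, r.score));
        }
        return lines;
    }

    // Writes the CSV to the given path, replacing any previous export.
    static void write(Path target, List<Rec> recs) throws IOException {
        Files.write(target, toCsvLines(recs), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        List<Rec> recs = new ArrayList<>();
        recs.add(new Rec(42L, 1001L, 4.5));   // made-up sample values
        recs.add(new Rec(42L, 1002L, 3.25));
        write(Paths.get("recommendations.csv"), recs);
    }
}
```

A database table with the same three columns would serve equally well; the point is only that the PHP application consumes finished results and never has to talk to Hadoop directly.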

For real-time use, the recommendations part of Mahout supports a variety of web-service interfaces that make it pretty easy to access from PHP. Hitting the model evaluation part of Mahout would require a bit more programming.

Upvotes: 7
