Umar Farooque

Reputation: 2059

Coding to find Z score in Apache Mahout and compute similarity

I am new to Apache Mahout. I have managed to use it for Pearson correlation and cosine similarity, but I need to normalize the data and use the Z-score to calculate similarity. I am unable to find methods in Mahout that allow this. The Mahout wiki also doesn't show how to normalize data and then use it to calculate similarity. I would be very thankful if someone could help me out with code for this.

Upvotes: 0

Views: 123

Answers (1)

Ted Dunning

Reputation: 1907

These questions are better answered on the mahout user mailing list.

In any case, it would be nice to understand what you are trying to do on a larger scale. It sounds like you might be trying to build a recommendation engine. If so, Pearson correlation is generally a really bad way to do that.

It is much better to use Mahout to compute indicator behaviors and then use a search engine such as Solr or ElasticSearch to deploy the recommendation function.

We described how to do this in a short O'Reilly book that you can get from:

https://www.mapr.com/practical-machine-learning
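
If you do want Z-scores specifically, here is a minimal sketch in plain Java. It does not use any Mahout API: the class and method names (`ZScoreSimilarity`, `zScore`, `cosine`) are made up for illustration, and it assumes each vector is standardized on its own using the population standard deviation. Keep in mind that the cosine of two vectors standardized this way is the same quantity as the Pearson correlation of the original vectors, so this is unlikely to behave differently from what you already have.

```java
public class ZScoreSimilarity {

    // Standardize a vector: subtract its mean, divide by its population standard deviation.
    // Hypothetical helper, not part of Mahout.
    static double[] zScore(double[] v) {
        double[] z = new double[v.length];
        if (v.length == 0) {
            return z;
        }
        double mean = 0.0;
        for (double x : v) {
            mean += x;
        }
        mean /= v.length;

        double sumSq = 0.0;
        for (double x : v) {
            sumSq += (x - mean) * (x - mean);
        }
        double std = Math.sqrt(sumSq / v.length);

        if (std == 0.0) {
            return z; // constant vector: leave every z-score at 0
        }
        for (int i = 0; i < v.length; i++) {
            z[i] = (v[i] - mean) / std;
        }
        return z;
    }

    // Cosine similarity of two equal-length vectors; returns 0 if either has zero norm.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Made-up ratings for two users over the same four items.
        double[] userA = {1, 2, 3, 4};
        double[] userB = {2, 4, 6, 9};
        System.out.println(cosine(zScore(userA), zScore(userB)));
    }
}
```

If you meant normalizing each item (column) across all users rather than each user's vector, the same `zScore` helper would be applied per column before comparing rows, and the result would no longer coincide with Pearson correlation.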

Upvotes: 1
