Subhradip Bose

Reputation: 3305

How can I implement LDA using Apache Mahout?

I have a data set like the one below, in CSV format.

FileName,Topic,Tag,Frequency
File-1,Topic -1,Tag-1,10
File-2,Topic -2,Tag-2,10
File-3,Topic -3,Tag-2,10
File-4,Topic -4,Tag-4,10
File-5,Topic -1,Tag-5,10
File-6,Topic -3,Tag-1,10
File-7,Topic -1,Tag-1,10 

I need to find correlations between the tags using Mahout's LDA (Latent Dirichlet Allocation) algorithm. Can anybody help me figure out how to do that with Apache Mahout?

I am also confused about exactly what input format Mahout expects.

It would be helpful if somebody could share some good resources for a Mahout beginner.

Upvotes: 1

Views: 1164

Answers (1)

codeviper

Reputation: 78

I might be late in answering, but Mahout no longer supports the old LDA implementation in versions above 0.6. You have to use CVB (Collapsed Variational Bayes, the cvb command) instead of lda to run topic models.

The following links may help:

https://mahout.apache.org/users/clustering/lda-commandline.html
https://mahout.apache.org/users/clustering/latent-dirichlet-allocation.html
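On the input format question: as far as I know, the Mahout topic-model pipeline starts from plain text documents, which are then turned into sequence files and term vectors (seqdirectory and seq2sparse) before cvb runs. A rough sketch of one way to get from the CSV in the question to such a directory is below. It assumes you want one document per FileName, with each Tag repeated Frequency times so that the term counts match your frequencies. That modelling choice, the function names and the output directory are my own assumptions, not something Mahout prescribes.

```python
import csv
import io
from collections import defaultdict
from pathlib import Path


def csv_to_documents(csv_text):
    """Group rows by FileName and repeat each tag `Frequency` times.

    Returns a dict mapping file name -> space-separated tag tokens.
    (Hypothetical helper; the column names come from the question's CSV.)
    """
    docs = defaultdict(list)
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        name = row["FileName"].strip()
        tag = row["Tag"].strip()
        count = int(row["Frequency"])
        docs[name].extend([tag] * count)
    return {name: " ".join(tokens) for name, tokens in docs.items()}


def write_documents(docs, out_dir):
    """Write one plain-text file per document into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, text in docs.items():
        (out / f"{name}.txt").write_text(text + "\n", encoding="utf-8")


# First two data rows from the question, used as sample input.
SAMPLE = """FileName,Topic,Tag,Frequency
File-1,Topic -1,Tag-1,10
File-2,Topic -2,Tag-2,10
"""

if __name__ == "__main__":
    for name, text in csv_to_documents(SAMPLE).items():
        print(name, "->", text)
```

Calling write_documents(csv_to_documents(your_csv_text), "mahout-input") would then give you a directory of text files to feed into the vectorization steps described in the linked pages. Note that the Topic column is ignored here, since the topic model is supposed to infer topics on its own.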

Upvotes: 1
