How can you integrate Mahout KMeans clustering into an application?

Question

I am trying to use Mahout KMeans in a simple application. I manually create a series of Vectors from database content, and I simply want to feed these vectors to Mahout (0.9), for example to KMeansClusterer, and use the output.

I have read Mahout in Action (whose examples target version 0.5) and many online forums for background, but I can no longer see any way to use Mahout KMeans (or related clustering) without going through Hadoop file names and file paths. The documentation is very sketchy. Can Mahout still be used in this way? Are there any current examples of using Mahout KMeans from application code rather than from the command line?

    private List<Cluster> kMeans(List<Vector> allvectors, double closeness, int numclusters, int iterations) {
        List<Cluster> clusters = new ArrayList<Cluster>();

        // Seed one initial cluster per input vector.
        int clusterId = 0;
        for (Vector v : allvectors) {
            clusters.add(new Kluster(v, clusterId++, new EuclideanDistanceMeasure()));
        }

        // Run k-means in memory; each entry of the result holds the clusters after one iteration.
        List<List<Cluster>> finalclusters = KMeansClusterer.clusterPoints(allvectors, clusters, 0.01, numclusters, 10);

        // Print the clusters from the last iteration.
        for (Cluster cluster : finalclusters.get(finalclusters.size() - 1)) {
            System.out.println("Cluster id: " + cluster.getId() + " center: " + cluster.getCenter().asFormatString());
        }

        return clusters;
    }
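
To make the goal concrete, here is a minimal sketch of the in-memory behaviour described above, written in plain Java with no Mahout or Hadoop classes. It is not Mahout's API: the class `InMemoryKMeans` and its methods `cluster` and `nearest` are hypothetical names, plain `double[]` arrays stand in for Mahout `Vector` objects, and seeding the centers with the first k points is a simplifying assumption.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Mahout: plain in-memory k-means (Lloyd's algorithm). */
final class InMemoryKMeans {

    /** Squared Euclidean distance between two vectors of equal length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Index of the center closest to the given point. */
    static int nearest(double[][] centers, double[] point) {
        int best = 0;
        for (int c = 1; c < centers.length; c++) {
            if (squaredDistance(centers[c], point) < squaredDistance(centers[best], point)) {
                best = c;
            }
        }
        return best;
    }

    /**
     * Runs at most maxIterations rounds and returns the k cluster centers.
     * Assumption: the first k points seed the centers, which keeps the sketch deterministic.
     */
    static double[][] cluster(double[][] points, int k, int maxIterations) {
        if (k < 1 || k > points.length) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        int dim = points[0].length;
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = points[c].clone();
        }
        for (int iter = 0; iter < maxIterations; iter++) {
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            // Assignment step: add each point to the running sum of its nearest center.
            for (double[] p : points) {
                int c = nearest(centers, p);
                counts[c]++;
                for (int i = 0; i < dim; i++) {
                    sums[c][i] += p[i];
                }
            }
            // Update step: move each non-empty center to the mean of its points.
            boolean moved = false;
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // leave an empty cluster's center where it is
                }
                for (int i = 0; i < dim; i++) {
                    double mean = sums[c][i] / counts[c];
                    if (mean != centers[c][i]) {
                        moved = true;
                    }
                    centers[c][i] = mean;
                }
            }
            if (!moved) {
                break; // no center changed, so the assignment is stable
            }
        }
        return centers;
    }

    public static void main(String[] args) {
        double[][] points = { {1.0, 1.0}, {9.0, 9.0}, {1.0, 2.0}, {8.0, 9.0} };
        double[][] centers = cluster(points, 2, 10);
        for (int c = 0; c < centers.length; c++) {
            System.out.println("Cluster id: " + c + " center: " + Arrays.toString(centers[c]));
        }
    }
}
```

A version built on Mahout itself would swap these arrays for `Vector` instances and the distance function for a `DistanceMeasure`. Which class in 0.9 accepts them without a Hadoop path is the open question here.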

