nik686

Reputation: 705

Running Mahout k-means on a Hadoop multi-node cluster

I am running k-means on a multi-node cluster. The input size is about 100 MB, and I have modified the bin/mahout file like this:

...
MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.min.split.size=10MB"
...
MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.map.tasks=10"

On each iteration I get:

12/09/12 17:05:02 INFO mapred.JobClient: Launched map tasks=1

12/09/12 17:05:02 INFO mapred.JobClient: Launched reduce tasks=6

12/09/12 17:05:02 INFO mapred.JobClient: Data-local map tasks=1

Does this mean that it runs on a single node instead of on multiple nodes? And if so, what am I missing in the configuration?

Upvotes: 1

Views: 572

Answers (1)

Sean Owen

Reputation: 66876

Surely you want to set the max split size rather than min, if you want more splits. It is still only a suggestion to the cluster.
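As a rough sketch of that suggestion, the change to bin/mahout might look like the following. It assumes the job's input format reads mapred.max.split.size and that the value is given as a plain number of bytes rather than with a unit suffix such as "10MB". The variable name MAX_SPLIT_BYTES is only illustrative.

```shell
# Assumption: the input format honours mapred.max.split.size, expressed in bytes.
# MAX_SPLIT_BYTES is a hypothetical helper variable, not something Mahout defines.
MAX_SPLIT_BYTES=$((10 * 1024 * 1024))   # 10 MB written as a byte count

# Cap the split size so that a larger input is cut into more splits.
MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.max.split.size=$MAX_SPLIT_BYTES"

echo "$MAHOUT_OPTS"
```

With roughly 100 MB of splittable input, a 10 MB cap would suggest on the order of ten map tasks, though the final count is still decided by the input format and the cluster.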

Upvotes: 3
