datadigger

Reputation: 181

K-means clustering error: only 0's may be mixed with negative subscripts

I am trying to run k-means clustering on the iris data in R. I want to use the KKZ method to select the seeds (the starting points of the clusters).

If I don't standardize the data, I have no issues with the kkz command:

library(inaparc)
res<- kkz(x=iris[,1:4], k=3) 
seed <- res$v        # this gives me the cluster seeds based on KKZ method
k1 <- kmeans(iris[,1:4], seed, iter.max=1000)

However, when I scale the data first, the kkz command gives me this error:

library(ClusterR)
dat <- center_scale(iris[1:4], mean_center = TRUE, sd_scale = TRUE)  # scale iris data
res2 <- kkz(x=dat, k=3)
# Error in x[-x[i, ], ] : only 0's may be mixed with negative subscripts

I think this is an array indexing issue, but I'm not sure what causes it or how to solve it.

Upvotes: 2

Views: 404

Answers (1)

StupidWolf

Reputation: 46928

For some reason, kkz cannot handle input that mixes positive and negative values. I run into a lot of problems with it, for example:

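library(inaparc)
set.seed(1000)
# OK: values drawn around 5, so almost surely all positive
kkz(matrix(rnorm(1000, 5, 1), 100, 10), 3)
# Not OK: values drawn around 0, so the signs are mixed
kkz(matrix(rnorm(1000, 0, 1), 100, 10), 3)
# Error in x[-x[i, ], ] : only 0's may be mixed with negative subscripts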
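
Judging only from the expression in the error message, `x[-x[i, ], ]` appears to use the data values of row `i` as negative row subscripts. I have not traced the package source, so treat that as an assumption. The base R rule it runs into can be shown on its own with a small made-up matrix `m`:

```r
m <- matrix(1:10, nrow = 5)        # toy matrix: 5 rows, 2 columns

# All-negative subscripts are valid: they drop rows.
# Fractional values are truncated toward zero, so this drops rows 1 and 2.
dropped <- m[-c(1.9, 2.2), ]
nrow(dropped)                      # 3

# Negating values of mixed sign gives subscripts of mixed sign,
# which R rejects. Capture the message instead of stopping.
idx <- -c(1.2, -1.7, 2.3)          # becomes c(-1.2, 1.7, -2.3)
msg <- tryCatch(
  {
    m[idx, ]
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg
```

If that reading is right, all-positive input avoids the error only because every negated value is a legal "drop this row" subscript, which is another reason to double-check the seeds you get back.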

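You don't really need to center your values, since centering only shifts the data and does not change the distances k-means works with. So you can scale without centering: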

dat <- center_scale(iris[1:4], mean_center = FALSE, sd_scale = TRUE)
res2 <- kkz(x=dat, k=3)
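
If you would rather not depend on ClusterR for this step, here is a minimal base R sketch of the same idea: divide each column by its standard deviation without subtracting the mean. Whether it matches `center_scale(..., mean_center = FALSE)` digit for digit is an assumption I have not checked, and `dat_pos` is just a name I made up.

```r
x <- as.matrix(iris[, 1:4])

# Divide each column by its standard deviation; no centering.
sds <- apply(x, 2, sd)
dat_pos <- sweep(x, 2, sds, "/")

# Every measurement in iris is positive, so the scaled values stay positive
# and each column now has unit standard deviation.
all(dat_pos > 0)
apply(dat_pos, 2, sd)
```

Note that `scale(x, center = FALSE, scale = TRUE)` is not the same thing: without centering, it divides by the root mean square of each column rather than the standard deviation.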

I would be quite cautious about using this package until you figure out why it behaves this way.
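
If you want KKZ seeds on centered data in the meantime, the procedure as it is usually described is short enough to write by hand: take the point with the largest norm as the first seed, then repeatedly add the point that is farthest from its nearest already-chosen seed. Below is a rough base R sketch under that description. `kkz_seeds` is a hypothetical helper, not part of inaparc, and I make no claim that it reproduces the output of `inaparc::kkz`.

```r
# Hypothetical helper (not from inaparc): KKZ-style seeding in base R.
kkz_seeds <- function(x, k) {
  x <- as.matrix(x)
  stopifnot(k >= 1, k <= nrow(x))
  # First seed: the row with the largest Euclidean norm.
  chosen <- which.max(rowSums(x^2))
  # Squared distance from every row to its nearest chosen seed so far.
  min_d2 <- rowSums(sweep(x, 2, x[chosen, ])^2)
  while (length(chosen) < k) {
    # Next seed: the row farthest from its nearest seed.
    nxt <- which.max(min_d2)
    chosen <- c(chosen, nxt)
    d2 <- rowSums(sweep(x, 2, x[nxt, ])^2)
    min_d2 <- pmin(min_d2, d2)
  }
  x[chosen, , drop = FALSE]
}

dat_std <- scale(iris[, 1:4])        # centered and scaled, so signs are mixed
seeds <- kkz_seeds(dat_std, k = 3)
k2 <- kmeans(dat_std, centers = seeds, iter.max = 1000)
table(k2$cluster)
```

Because it only uses norms and distances, mixed signs are not a problem here. Keep in mind that the first seed depends on the norm, so centered and uncentered data can give different seeds.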

Upvotes: 1
