Haroon Lone

Reputation: 2949

Cluster data using medoids (cluster centers) in R

I have a data frame with three features:

library(cluster)
df <- data.frame(f1=rnorm(480,30,1),
                 f2=rnorm(480,40,0.5),
                 f3=rnorm(480,50, 2))

Now I want to do K-medoids clustering in two steps. In step 1, I want to obtain the medoids (cluster centers) from part of df; in step 2, I want to use those medoids to cluster the remaining data. Accordingly:

# find medoids using some data 
sample_data <- df[1:240,]
sample_data <- scale(sample_data) # scaling features
clus_res1 <- pam(sample_data,k = 4,diss=FALSE)

# Now perform clustering using medoids obtained from above clustering
test_data <- df[241:480,]
test_data <- scale(test_data)
clus_res2 <- pam(test_data,k = 4,diss=FALSE,medoids=clus_res1$medoids)

This script fails with the following error:

Error in pam(test_data, k = 4, diss = FALSE, medoids = clus_res1$medoids) : 
  'medoids' must be NULL or vector of 4 distinct indices in {1,2, .., n}, n=240

It is clear that the error is caused by the format of the medoid matrix I pass in. How can I convert this matrix to a vector of indices, as the error message requires?

Upvotes: 0

Views: 582

Answers (2)

Has QUIT--Anony-Mousse

Reputation: 77485

The medoids parameter expects the index numbers of points in your data set, not their coordinates. So c(42, 17) means: use objects 42 and 17 as the initial medoids.

By the definition of medoids, you can only use points of your data set as medoids, not other vectors!
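
To illustrate the point about indices, here is a minimal sketch based on the data from the question. The fixed seed and the name clus_res_again are my own additions for illustration. The pam result carries the medoid row indices in id.med, alongside the coordinate matrix in medoids, and only the former is accepted as the medoids argument:

```r
library(cluster)

set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
df <- data.frame(f1 = rnorm(480, 30, 1),
                 f2 = rnorm(480, 40, 0.5),
                 f3 = rnorm(480, 50, 2))
sample_data <- scale(df[1:240, ])

clus_res1 <- pam(sample_data, k = 4, diss = FALSE)

# id.med holds the row indices (within sample_data) of the 4 medoids;
# clus_res1$medoids holds their coordinates instead
print(clus_res1$id.med)

# Indices are valid initial medoids for the SAME data set
clus_res_again <- pam(sample_data, k = 4, diss = FALSE,
                      medoids = clus_res1$id.med)
```

Note that these indices refer to rows of sample_data, so they are not meaningful for a different data set such as test_data.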

Clustering is unsupervised. There is no need to split your data into training and test sets, because in unsupervised learning there are no labels to overfit to.

Upvotes: 0

Kozolovska

Reputation: 1119

Note that in PAM each cluster center is itself an observation: with k = 4 you get 4 observations, each of which is the center of one cluster. Demonstration of PAM.

So if you want to reuse the same centers, you need to find the observations in the new data that are closest to the observations that serve as centers in your training data.
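
The nearest-observation idea above could be sketched roughly as follows. This is only one way to do it and rests on some assumptions of mine: the fixed seed, the names train, test and nearest_idx, Euclidean distance, and scaling the second half with the center and scale of the first half so that both halves share one coordinate system (the question scales them separately).

```r
library(cluster)

set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
df <- data.frame(f1 = rnorm(480, 30, 1),
                 f2 = rnorm(480, 40, 0.5),
                 f3 = rnorm(480, 50, 2))

train <- scale(df[1:240, ])
# Assumption: reuse the training center/scale for the remaining data
test <- scale(df[241:480, ],
              center = attr(train, "scaled:center"),
              scale  = attr(train, "scaled:scale"))

clus_res1 <- pam(train, k = 4, diss = FALSE)

# For each training medoid, find the row of `test` with the smallest
# squared Euclidean distance to it
nearest_idx <- unname(apply(clus_res1$medoids, 1, function(m) {
  which.min(colSums((t(test) - m)^2))
}))

# pam requires k distinct indices, so guard against two medoids
# mapping to the same test observation
if (anyDuplicated(nearest_idx)) stop("nearest observations are not distinct")

# Use those test observations as the initial medoids
clus_res2 <- pam(test, k = 4, diss = FALSE, medoids = nearest_idx)
print(table(clus_res2$clustering))
```

By default pam still runs its swap phase, so the final medoids may differ from the initial ones. As far as I know, passing do.swap = FALSE skips that phase and keeps the supplied medoids fixed.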

Upvotes: 0
