Raghav Sharma

Reputation: 49

Kmeans clustering error: Issue plotting the clusters

I am reading data from a data frame I created earlier and selecting a few columns from it for this task. However, when I rescale the data frame object, it becomes a "double" (not a "list" as it was).

I can successfully cluster the data, but when I try to draw a simple 2D plot of it I get the following error:

Error: data must be a data frame, or other object coercible by fortify(), not a list

I also tried using as.list to convert ilpd_df2 back from a "double" to a "list", but it still does not plot.

#Task 2.1 - Load Preprocessed Data and Subset Data as directed
    library(dplyr)    # provides select()
    library(ggplot2)  # provides ggplot()
    ilpd_df <- readRDS(file = "ilpd_preprocessed.Rda")
    ilpd_df1 <- select(ilpd_df, "TB", "DB", "Alkphos", "Sgpt", "Sgot", "TP", "Albumin")

#Task 2.2 - Re-Scaling (min-max, each column mapped to [0, 1])
    ilpd_df2 <- apply(ilpd_df1, MARGIN = 2,
                      FUN = function(X) (X - min(X)) / diff(range(X)))

#Task 2.3 - Cluster the Data into 2 Clusters
    set.seed(44)
    ilpd_clusters <- kmeans(ilpd_df2, 2, nstart = 25, iter.max = 5)
    ggplot(ilpd_df2, aes(Alkphos, TP)) + geom_point()  # plotting fails here


Error: `data` must be a data frame, or other object coercible by `fortify()`, not a list

Upvotes: 0

Views: 232

Answers (1)

Raghav Sharma

Reputation: 49

My error was pointed out to me by @Elin, who suggested the following:

1) Using dplyr::select() would yield a tibble.
2) Coerce the tibble to a data frame.

According to my own understanding and @Elin's guidance, I did the following:

Convert ilpd_df2 back to a data frame (after apply() it is a numeric matrix, not a data frame):

ilpd_df2 <- as.data.frame(ilpd_df2)
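To see why the coercion matters, here is a minimal base R sketch. It assumes nothing about the real data: `toy_df` and the two `iris` columns are stand-ins I chose because `ilpd_preprocessed.Rda` is not available, and the rescaling function is the same min-max formula as in the question.

```r
# Stand-in data (assumption): two numeric columns of the built-in iris data,
# used in place of the ILPD columns from the question.
toy_df <- iris[, c("Sepal.Length", "Petal.Length")]

# Same min-max rescaling as in the question.
rescaled <- apply(toy_df, MARGIN = 2,
                  FUN = function(x) (x - min(x)) / diff(range(x)))

is.data.frame(toy_df)                  # TRUE: the input is a data frame
is.matrix(rescaled)                    # TRUE: apply() returned a matrix
typeof(rescaled)                       # "double"
is.data.frame(as.list(rescaled))       # FALSE: as.list() gives a plain list

# as.data.frame() restores a data frame and keeps the column names.
rescaled_df <- as.data.frame(rescaled)
is.data.frame(rescaled_df)             # TRUE
names(rescaled_df)                     # "Sepal.Length" "Petal.Length"
```

This matches what the question describes: the rescaled object reports "double", and as.list() does not help because a plain list is still not a data frame.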

Then I ran the code again and the following plot was generated: [image: plot generated post code]
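If the goal is to show the clusters themselves, one possible extension is to attach the k-means labels to the data frame and map them to colour. This is only a sketch under assumptions: it uses two `iris` columns as stand-in data, and the names `toy_df`, `rescaled_df`, `toy_clusters` and `p` are mine, not from the original code.

```r
library(ggplot2)

# Stand-in data (assumption): replace with the rescaled ILPD columns.
toy_df <- iris[, c("Sepal.Length", "Petal.Length")]

# Rescale, then coerce the matrix returned by apply() back to a data frame.
rescaled_df <- as.data.frame(
  apply(toy_df, MARGIN = 2, FUN = function(x) (x - min(x)) / diff(range(x)))
)

set.seed(44)
toy_clusters <- kmeans(rescaled_df, centers = 2, nstart = 25)

# kmeans() stores one integer cluster label per row in $cluster;
# a factor makes ggplot2 treat it as a discrete colour scale.
rescaled_df$cluster <- factor(toy_clusters$cluster)

p <- ggplot(rescaled_df, aes(Sepal.Length, Petal.Length, colour = cluster)) +
  geom_point()
print(p)
```

With the question's data, the same pattern would use `aes(Alkphos, TP, colour = cluster)` on the coerced `ilpd_df2`. The cluster column should be added only after kmeans() has run, so the labels do not feed into the clustering.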

Upvotes: 1
