Reputation: 8404
I have the data frame below, which I process in order to create a cluster scatter plot with:
library(tidyverse) # data manipulation
library(cluster) # clustering algorithms
library(factoextra) # clustering algorithms & visualization
library(plotly)
df <- USArrests
df <- na.omit(df)
df <- scale(df)
distance <- get_dist(df)
k2 <- kmeans(df, centers = 2, nstart = 25)
df %>%
as_tibble() %>%
mutate(cluster = k2$cluster,
state = row.names(USArrests))
p2<-fviz_cluster(k2, data = df, geom="point")
#+ scale_fill_discrete(name = "Cluster", labels = c("1", "2", "3","4"))
p2
ggplotly(p2)
When I use ggplotly(), the legend names change, so I'm looking for a way to set them manually or even hide the legend altogether.
Upvotes: 1
Views: 1842
Reputation: 506
The easiest way I came across is to rename the label inside the plotly object.
p2<-fviz_cluster(k2, data = df, geom="point")
p3 <- ggplotly(p2)
p3[["x"]][["data"]][[2]][["name"]] <- "2"
p3
It's not pretty but helps in the short term.
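The question also asks how to hide the legend altogether. A minimal sketch of one way to do that, assuming a plain ggplot as a stand-in for the fviz_cluster() plot (the plot `p` and the chosen columns are illustrative, not taken from the question): pass `showlegend = FALSE` to plotly's `layout()` after the conversion.

```r
library(ggplot2)
library(plotly)

# Stand-in for the fviz_cluster() plot: a plain ggplot coloured by k-means cluster
df <- as.data.frame(scale(na.omit(USArrests)))
set.seed(1)
df$cluster <- factor(kmeans(df, centers = 2, nstart = 25)$cluster)
p <- ggplot(df, aes(Murder, Assault, colour = cluster)) + geom_point()

# Convert to plotly, then switch the legend off in the layout
p_nolegend <- plotly::layout(ggplotly(p), showlegend = FALSE)
p_nolegend
```

The same `layout()` call should work on the `p3` object from this answer, since it is also the result of `ggplotly()`.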
Edit: So there was more than one question. The first is about the legend labels; the second is about interactive points in the plot.
# Most of the example code was given in the question;
# the only change is the added centers variable
# Example
library(tidyverse) # data manipulation
library(cluster) # clustering algorithms
library(factoextra) # clustering algorithms & visualization
library(plotly)
df <- USArrests
df <- na.omit(df)
df <- scale(df)
distance <- get_dist(df)
# added a centers variable for the number of centers in kmeans
# this will also be used to select elements from ggplot or ggplotly later
centers <- 4
k2 <- kmeans(df, centers = centers, nstart = 25)
df %>%
as_tibble() %>%
mutate(cluster = k2$cluster,
state = row.names(USArrests))
p2<-fviz_cluster(k2, data = df, geom="point")
p2
p3 <- ggplotly(p2)
# Solution
# First problem: changing the legend labels
# The conversion from ggplot to plotly garbles the legend names when
# several scales (here colour and shape) share one legend.
# I don't know why renaming only the point-layer traces is enough
# to make the legend look as intended.
for (i in seq_len(centers)) {
p3[["x"]][["data"]][[i]][["name"]] <- as.character(i)
}
# Second problem: interactive points
# ggplot stores the data in one list, whereas ggplotly splits the data
# into separate traces depending on layer and cluster.
# For the hover labels it is enough to change the point traces
# (the first `centers` traces).
# To add more information to the labels,
# build the name_states values with HTML.
for (i in 1:centers) {
name_states <- p2[["data"]]%>%
filter(cluster==i)%>%
select(name)
p3[["x"]][["data"]][[i]][["text"]] <- as.vector(name_states$name)
}
# Reverse the order of the traces, because the polygon layer is on top and
# makes it impossible to hover over the points beneath it
# (centers*3 assumes three traces per cluster)
p3[["x"]][["data"]] <- p3[["x"]][["data"]][(centers*3):1]
# Now you can hover over every point and can see the state name
p3
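The renaming loop above relies on the point traces coming first. As a rough alternative sketch, the names can be cleaned by pattern instead of by position. This assumes the garbled names look like "(1,1)", which is what ggplotly tends to produce when several aesthetics share one legend; `clean_trace_names` is a hypothetical helper, not part of plotly.

```r
# Hypothetical helper: rewrite trace names such as "(2,1)" to "2"
# and leave traces without a name, or with another name format, untouched.
clean_trace_names <- function(p) {
  for (i in seq_along(p$x$data)) {
    nm <- p$x$data[[i]]$name
    if (!is.null(nm)) {
      # keep only the first element inside the parentheses
      p$x$data[[i]]$name <- sub("^\\(([^,]+),.*\\)$", "\\1", nm)
    }
  }
  p
}

# Works on any list with the same shape as a plotly object
mock <- list(x = list(data = list(
  list(name = "(1,1)"),
  list(name = "(2,1)"),
  list(x = 1)  # a trace without a name
)))
cleaned <- clean_trace_names(mock)
```

With the real plot this would be called as `p3 <- clean_trace_names(p3)`; whether the legend then looks as intended still depends on how ggplotly grouped the traces.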
Upvotes: 2