SHW

Reputation: 501

Scatterplot in ggplot with specific color set. How to get the legend right?

I want to use a more varied color palette in my scatterplot, because I have a high n and the default colors are not distinct enough to tell the groups apart. I've generated a color vector that works, but now I can't get the legend right: I either get no legend at all or a legend showing the names of the colors. I am pretty sure I am mixing up aesthetics and attributes, but I don't know what I'm doing wrong.

My code and three attempts are below. What I want is for the point colors to match the color vector I created (col_sample), while the legend labels match the name column in the dataframe.

library(ggplot2)

#dataframe
df1 <- data.frame(name = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "a", "b", "c", "d", "e"),
             n = rep(1:31, 1),
             value = rep(31:1, 1))

df1$name <- as.factor(df1$name)


#produce color vector
color <- grDevices::colors()[grep('gr(a|e)y', grDevices::colors(), invert = TRUE)]
col_sample <- sample(color, 31)
col_sample <- as.vector(col_sample) 


#scatterplot
median_scatter <- ggplot(data = df1,
                     aes(x = n, 
                         y = value,
                         col = name))

#try 1: these colors are too similar
median_scatter +
  geom_point() 

#try 2: the legend disappears
median_scatter +
  geom_point(col = col_sample) 

#try 3: the legend shows the color names instead of the names in df1
median_scatter +
  geom_point(aes(col = col_sample))

Upvotes: 0

Views: 1000

Answers (1)

Djork

Reputation: 3369

You define the color scale manually using scale_colour_manual.

median_scatter <- ggplot(data = df1,
                     aes(x = n, 
                     y = value,
                     colour = name))
median_scatter + 
  geom_point() +
  scale_colour_manual(values=col_sample)

Note that the legend is tied to the aes() mapping. In try #2 you have overridden the colour aesthetic aes(col = name) from the parent ggplot() call by assigning a vector of colors directly to col in geom_point. There is no longer any association between name and col_sample, therefore no legend is drawn.

In try #3 you have remapped the aesthetic with aes(col = col_sample), so the color names themselves become the mapped variable: they appear as the legend labels, and ggplot assigns its default colors to them.
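
If you want the pairing between names and colors to be explicit rather than positional, one option is to pass a named vector to scale_colour_manual. The sketch below is only an illustration under a few assumptions: it rebuilds the question's data with letters, draws one color per factor level (26 rather than 31, since five names repeat), and fixes a seed so the palette is reproducible. The names lvls and named_cols are made up for this example.

```r
library(ggplot2)

# Same shape as the question's data: 31 rows, 26 distinct names
df1 <- data.frame(name = factor(c(letters, letters[1:5])),
                  n = 1:31,
                  value = 31:1)

# Candidate colors: every built-in R color except the greys
color <- grDevices::colors()[grep("gr(a|e)y", grDevices::colors(), invert = TRUE)]

set.seed(42)  # assumption: fixed seed so the same palette is drawn each run
lvls <- levels(df1$name)

# One color per factor level, named by level so the mapping is explicit
named_cols <- setNames(sample(color, length(lvls)), lvls)

# colour is mapped to name inside aes(), so the legend shows the names;
# the manual scale only decides which color each name gets
p <- ggplot(df1, aes(x = n, y = value, colour = name)) +
  geom_point() +
  scale_colour_manual(values = named_cols)

p
```

With an unnamed vector, as in the answer above, the values are matched to the factor levels in order, so supplying 31 colors for 26 levels still works and the surplus colors are simply unused.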

Upvotes: 2
