Nova

Reputation: 618

How to overlay scatterplots in ggplot when one plot has colors defined in its dataframe?

I am trying to overlay two scatter plots. Here is the base code:

ggplot() + geom_point(data = df, aes(A, B, color = Cluster), shape=1)  + 
  geom_point(data = as.data.frame(centers), aes(A, B), shape=13, size=7, alpha = 5)

This is what the plot looks like: [image not shown]

But when I attempt to add a color to the overlaid cluster centers (those circles with X inside):

ggplot() + geom_point(data = df, aes(A, B, color = Cluster), shape=1)  + 
  geom_point(data = as.data.frame(centers), aes(A, B, color = "red"), shape=13, size=7, alpha = 5)

I get the following error: "Error: Discrete value supplied to continuous scale"

Here is a portion of the dataframe I am using to plot the first of two overlays:


> df
              A             B Cluster
1    1.33300195 -1.4524680585       2
2    1.41102294 -0.7889431279       2
3    1.36350553 -1.4437548005       2
4    1.61462300 -0.7145174514       2
5   -0.64722704  0.8449845639       1
6    1.33855918 -0.9161504530       2
7    1.33467865 -2.1513899524       2
8    1.50842550 -0.5170262065       2
9    1.67045671 -0.3644476090       2
10   1.32328373 -1.5496692059       2

My theory is that ggplot is interpreting the "Cluster" column of that dataframe as a continuous variable. Is there a way to change it so it's discrete? Or should I instead use a column of colors stored as factors, for example 1 becomes "Blue" and 2 becomes "Black"?

Upvotes: 1

Views: 2486

Answers (1)

Duck

Reputation: 39595

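This should work. I did not have the data for centers, so I could not include that layer in my plot. You are right that the continuous variable is what breaks the plot: Cluster is numeric, so ggplot gives it a continuous color scale, and the string "red" inside aes() is then a discrete value mapped onto that same scale. Wrap Cluster in factor() and use scale_color_manual() to choose the colors. A constant color such as "red" goes outside aes(); inside aes() it would become a third level of the color scale, which the two values passed to scale_color_manual() would not cover. Here is the code:

library(ggplot2)
#Code
ggplot() + geom_point(data = df, aes(A, B, color = factor(Cluster),
                                     fill = factor(Cluster)))  + 
  # color is a fixed value here, so it is set outside aes(); alpha is an opacity in [0, 1]
  geom_point(data = as.data.frame(centers), aes(A, B), color = "red",
             shape=13, size=7, alpha = 1)+
  scale_color_manual(values=c('blue','black'))+labs(color='Cluster',fill='Cluster')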

Output: [image not shown]

Or keeping the original shape:

#Code 2
ggplot() + geom_point(data = df, aes(A, B, color = factor(Cluster)),shape=1)  + 
  # fixed color goes outside aes() so it does not add a level to the color scale
  geom_point(data = as.data.frame(centers), aes(A, B), color = "red",
             shape=13, size=7, alpha = 1)+
  scale_color_manual(values=c('blue','black'))+labs(color='Cluster')

Output: [image not shown]
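
Since the question did not include the centers data, here is a minimal, self-contained sketch of the whole overlay using made-up data. Everything about the data is an assumption for illustration: the simulated df, and a centers table with one row per cluster and columns A and B (computed here as cluster means rather than taken from the asker's clustering result). It converts Cluster to a factor once, maps it to color in the first layer, and sets the center color as a constant outside aes().

```r
library(ggplot2)

# Hypothetical data standing in for the asker's df (not the real values)
set.seed(1)
df <- data.frame(
  A = c(rnorm(20, mean = 1.4, sd = 0.2), rnorm(20, mean = -0.6, sd = 0.2)),
  B = c(rnorm(20, mean = -1.0, sd = 0.5), rnorm(20, mean = 0.8, sd = 0.5)),
  Cluster = rep(c(2, 1), each = 20)  # numeric, as in the question
)

# Assumed shape of `centers`: one row per cluster with columns A and B
centers <- aggregate(cbind(A, B) ~ Cluster, data = df, FUN = mean)

# Convert once so ggplot treats Cluster as discrete
df$Cluster <- factor(df$Cluster)

p <- ggplot() +
  geom_point(data = df, aes(A, B, color = Cluster), shape = 1) +
  # constant color set outside aes(), so it never touches the color scale
  geom_point(data = centers, aes(A, B), color = "red", shape = 13, size = 7) +
  # named values tie each color to a factor level regardless of level order
  scale_color_manual(values = c("1" = "blue", "2" = "black")) +
  labs(color = "Cluster")

print(p)
```

Naming the values in scale_color_manual() also covers the "1 becomes blue, 2 becomes black" idea from the question without needing a separate column of color names.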

Upvotes: 1
