Reputation: 81
How can I give the points from each dataset their own colour? With the code below, the last colour overwrites the first, so my legend shows data1 and data2 in the same colour, which is not what I want.
ggplot(data,aes(x,y,color,group))+
geom_point(data1,aes(fill="data1"),shape="*",size=12,color="blue")+
geom_point(data2,aes(fill="data2"),shape="*",size=12,color="red")
Note that data1 and data2 were derived from data under certain conditions.
Upvotes: 0
Views: 623
Reputation: 13893
First, let's talk about what's really going on in your example. Then, I'll provide two ways to solve it.
Here's a reprex of OP's example:
library(ggplot2)
set.seed(8675309)
data1 <- data.frame(x=1:10, y=rnorm(10, 10))
data2 <- data.frame(x=sample(1:10, 10, replace=TRUE), y=rnorm(10, 11, 0.4))
ggplot(mapping= aes(x=x, y=y)) +
geom_point(data=data1, aes(fill='data1'), shape='*', size=12, color='blue') +
geom_point(data=data2, aes(fill='data2'), shape='*', size=12, color='red')
At first glance it looks like the points are colored as we wanted them to be, but the legend is colored only by the last geom_point. But... that's not actually what's going on here. In reality, the legend looks that way because a red point is drawn on top of a blue point in both legend keys. We can demonstrate this very clearly by changing the shape of the blue point:
ggplot(mapping= aes(x=x, y=y)) +
geom_point(data=data1, aes(fill='data1'), size=12, color='blue') +
geom_point(data=data2, aes(fill='data2'), shape='*', size=12, color='red')
The reason for the overplotting is simple: OP has mapped the fill aesthetic inside aes() and then set color separately, as a constant outside aes(), in each layer. Therefore, the legend does not reflect the difference in color, but the difference in fill. Since "*" is not a shape that has a fill, there is no difference in appearance other than the difference in color.
There are two ways to fix this. Both involve moving color from outside aes() to inside aes(). One way maintains the two datasets data1 and data2 as separate data frames as OP has it, where we have a geom_point call for each dataset, and the second way applies Tidy Data principles and is generally much better practice for plotting with ggplot2.
The non-Tidy Way
Move color inside aes() for both geom_point calls and remove fill, since it doesn't apply here. As a result, ggplot will create a legend and add "data1" and "data2" to it. Colors are chosen automatically, but if we want to specify them, we can use scale_color_manual():
ggplot(mapping= aes(x=x, y=y)) +
geom_point(data=data1, aes(color='data1'), shape='*', size=12) +
geom_point(data=data2, aes(color='data2'), shape='*', size=12) +
scale_color_manual(values=c('blue', 'red'))
By the way, if you keep color both inside and outside aes(), the color outside of aes() will overwrite the one inside the aes() function. This means your points will be the right color, but no legend is drawn.
The Tidy Data Way
Again, this way is strongly preferred. The idea is that you should combine your datasets into one, adding a column to differentiate the origin of the data. You then use that column to indicate how to label and color the points. You only need one call to geom_point to make this work. It may not look like much of an improvement in this particular example, but consider what the difference would be if you had 10 datasets.
library(dplyr)
# note we pass a named list so the id column is populated with the dataset names
df <- bind_rows(list(data1=data1, data2=data2), .id="id")
ggplot(df, aes(x=x, y=y, color=id)) +
geom_point(shape='*', size=12) +
scale_color_manual(values=c('blue', 'red'))
The resulting plot is identical to the other one.
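To make the point about many datasets concrete, here is a rough sketch with four data frames. The list datasets and the extra data frames data3 and data4 are made up for illustration and are not from OP's data; the plotting code is unchanged apart from leaving the colors to ggplot's defaults.

```r
library(ggplot2)
library(dplyr)

set.seed(8675309)
# hypothetical stand-ins for several separate datasets
datasets <- list(
  data1 = data.frame(x=1:10, y=rnorm(10, 10)),
  data2 = data.frame(x=1:10, y=rnorm(10, 11, 0.4)),
  data3 = data.frame(x=1:10, y=rnorm(10, 12, 0.4)),
  data4 = data.frame(x=1:10, y=rnorm(10, 13, 0.4))
)

# the list names become the values of the id column
df <- bind_rows(datasets, .id="id")

# still a single geom_point() call, however many datasets there are
p <- ggplot(df, aes(x=x, y=y, color=id)) +
  geom_point(shape='*', size=12)
p
```

With separate data frames, each additional dataset would need its own geom_point call; here it only needs another entry in the list.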
While not a part of the question, OP indicated that in their particular case, there was already a color aesthetic defined (so the values sent for scale_color_manual() were not sufficient). There are some options for how to proceed here:
- Pass a named vector to values in scale_color_manual() (e.g. c("data1" = "blue", "data2" = "red", ...)).
- Use the fill aesthetic with a point shape that has a fill.
- Use the fill aesthetic and keep your point shape and color, but override the aesthetics in the legend.
Without the actual data from the OP and the specific code that includes the conflicting color aesthetic, it's difficult to suggest the best course for that particular case; however, I'll demonstrate the final two approaches here:
ggplot(mapping= aes(x=x, y=y)) +
geom_point(data=data1, aes(fill='data1'), shape=21, size=12, color=NA) +
geom_point(data=data2, aes(fill='data2'), shape=21, size=12, color=NA) +
scale_fill_manual(values=c('data1'='blue', 'data2'='red'))
ggplot(mapping= aes(x=x, y=y)) +
geom_point(data=data1, aes(fill='data1'), shape='*', size=12, color='blue') +
geom_point(data=data2, aes(fill='data2'), shape='*', size=12, color='red') +
guides(
fill=guide_legend(override.aes = list(color=c('blue','red')))
)
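For completeness, the named-vector form of values mentioned above could look something like the sketch below. It reuses the reprex data and only covers the two levels shown here; any other levels already on OP's color scale would need their own entries, which can't be filled in without seeing their data. The object name p is just for illustration.

```r
library(ggplot2)

set.seed(8675309)
data1 <- data.frame(x=1:10, y=rnorm(10, 10))
data2 <- data.frame(x=sample(1:10, 10, replace=TRUE), y=rnorm(10, 11, 0.4))

# names in values are matched to the levels of the color scale,
# so the order of the entries no longer matters
p <- ggplot(mapping=aes(x=x, y=y)) +
  geom_point(data=data1, aes(color='data1'), shape='*', size=12) +
  geom_point(data=data2, aes(color='data2'), shape='*', size=12) +
  scale_color_manual(values=c('data2'='red', 'data1'='blue'))
p
```

Because the colors are matched by name rather than by position, adding more levels to the scale later does not silently shift which color goes with which dataset.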
Upvotes: 1