Reputation: 475
I am following Chapter 1 of Wickham and Grolemund's "R for data science" on visualization.
I have tried:
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))
hoping to get a plot with all points colored blue, but instead, to my surprise, they were all red! Looking up the correct code for blue points, on page 11 of the printed version or in Section 3.3 of the online version, I found it should be
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy), color = "blue")
and, in fact, they state that, to manually set an aesthetic, you have to give it outside the aes() function but inside the corresponding geom, geom_point() here. Why is that? What is the exact explanation for this behavior? It seemed natural to me that the correct syntax would be that of the first command. I guess this issue is related to layers and/or to the scope of variables, but I just could not get the hang of it... Can someone spoon-feed me?
Edit: Sorry for not doing my homework properly: this is just Exercise 1 proposed in the text itself at the end of the corresponding section... The answer, however, still escapes me.
Upvotes: 7
Views: 3939
Reputation: 156
This is quite an old post, but I was stuck with the same problem for hours, and this discussion helped me to make things more clear. So here I go with a short answer.
Using your first line of code (where color goes inside aes()) will not color your points blue.
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))
Why not? Everything inside aes() is treated as data to be mapped: displ (your x variable) and hwy (your y variable) are columns of your data frame. How does "blue" fit in here? It actually doesn't. "blue" is not a column; it is a constant string, so ggplot2 treats it as a variable with a single value, puts all points in one group, and gives that group the first color of the default palette. The string itself only shows up as the label in your legend (here "blue" could have been any string).
In your second line of code, color goes outside aes(), and as you can see, it works. In this case, with only one colour, no legend is needed.
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy), color = "blue")
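One way to see the difference between the two commands is to inspect the colours ggplot2 actually computes for each layer. The following is a minimal sketch, assuming ggplot2 is installed; the variable names (p_mapped, p_set, mapped_colours, set_colours) are made up for illustration.

```r
library(ggplot2)

# Mapped: "blue" is treated as a data value, not as a colour name
p_mapped <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))

# Set: "blue" is passed directly to the layer as the point colour
p_set <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

# layer_data() returns the data frame ggplot2 computes for a layer,
# including the final colour assigned to every point
mapped_colours <- unique(layer_data(p_mapped)$colour)
set_colours <- unique(layer_data(p_set)$colour)

print(mapped_colours)  # a single default-palette colour, not "blue"
print(set_colours)     # the literal value that was set
```

In both cases every point ends up with the same colour; only in the second case is that colour the one you asked for.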
If you want to control the specific colors of your color aesthetic when it is mapped to a third variable (drv in this case), you should use scale_color_manual().
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = drv)) +
scale_color_manual(values = c("green", "yellow", "red"))
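When the colours are given as an unnamed vector, they are matched to the levels of drv in order. A variant that may be easier to read is to name each colour after the level it belongs to. This is only a sketch, assuming ggplot2 is installed; drv_colours and p_named are illustrative names, and the colour choices simply repeat the ones above.

```r
library(ggplot2)

# Name each colour after the drv level it should apply to
# (the drv column of mpg takes the values "4", "f" and "r")
drv_colours <- c("4" = "green", "f" = "yellow", "r" = "red")

p_named <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = drv)) +
  scale_color_manual(values = drv_colours)

# Inspect which colours the layer ends up using
used_colours <- unique(layer_data(p_named)$colour)
print(used_colours)
```

With named values, the assignment no longer depends on the order in which the colours are written.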
Upvotes: 1
Reputation: 3499
I remember how completely confused I was by this when I started using ggplot.
To build on @Mauicio Calvao's answer, use color inside the aes to break up the colours in the plot by a variable of the data.frame you are plotting, e.g.:
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = drv))
So when color (or size or linetype or similar things) is inside the aes, it's really asking by what object/variable the colour groups should be determined. If this is a string (e.g. "blue"), then all points are put in a single group, but the name of that group isn't related to the actual colour.
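To illustrate that a string inside the aes behaves like a grouping variable with a single value, the sketch below compares it with an actual constant column added to the data. It assumes ggplot2 is installed; mpg_const and its label column are hypothetical names introduced only for this example.

```r
library(ggplot2)

# Copy of mpg with a constant column; every row holds the string "blue"
mpg_const <- mpg
mpg_const$label <- "blue"

# Colour mapped to the constant column
p_column <- ggplot(data = mpg_const) +
  geom_point(mapping = aes(x = displ, y = hwy, color = label))

# Colour mapped to the bare string
p_string <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))

# Both layers end up with the same computed colours
column_colours <- layer_data(p_column)$colour
string_colours <- layer_data(p_string)$colour
print(identical(column_colours, string_colours))
```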
To assign colours once grouped by color inside the aes, you use scale_color_manual:
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = drv))+
scale_colour_manual(values = c("black","blue","orange"))
Upvotes: 2
Reputation: 475
This issue and more specifically the difference in the output from the two mentioned commands are explicitly dealt with in Section 5.4.2 of the 2nd edition of "ggplot2. Elegant graphics for data analysis", by Hadley Wickham himself:
Either:
you map (inside aes) a variable of your data to an aesthetic, e.g., aes(..., color = VarX), or ...
you set (outside aes, but inside a geom element) an aesthetic to a constant value, e.g. "blue".
In the first case, of mapping an aesthetic such as color, ggplot2 assigns default colors spaced evenly around the color wheel, one per distinct value of the mapped variable. Because the value mapped here is a single constant, there is only one group, and it gets the first default color, which happens to be a reddish hue. Why should the chosen color coincide with the constant value you happened to choose to map from? More explicitly, if you try the command:
ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y =hwy, color = "foo"))
you get exactly the same output plot as in the first command of the original question.
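This can be checked without looking at the plots, by comparing the colours computed for the two layers. A minimal sketch, assuming ggplot2 is installed (p_blue, p_foo and the other variable names are illustrative):

```r
library(ggplot2)

p_blue <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))

p_foo <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = "foo"))

# The mapped string only labels the group; it does not pick the colour
blue_colours <- layer_data(p_blue)$colour
foo_colours <- layer_data(p_foo)$colour

print(unique(blue_colours))
print(unique(foo_colours))
```

The only visible difference between the two plots is the legend label.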
Upvotes: 9