Reputation: 55
I have data saved in multiple datasets, each consisting of four variables. Imagine something like a data.table dt consisting of the variables Country, Male/Female, Birthyear, and Weighted Average Income. I would like to create a graph that shows only one country's weighted average income by birth year, split by male/female. I've used the facet_grid() function to get a grid of graphs for all countries, as below.
ggplot() +
geom_line(data = dt,
aes(x = Birthyear,
y = Weighted Average Income,
colour = 'Weighted Average Income'))+
facet_grid(Country ~ Male/Female)
However, when I try to isolate the graphs for just one country, the code below doesn't seem to work. How can I subset the data correctly?
ggplot() +
geom_line(data = dt[Country == 'Germany'],
aes(x = Birthyear,
y = Weighted Average Income,
colour = 'Weighted Average Income'))+
facet_grid(Country ~ Male/Female)
Upvotes: 0
Views: 1376
Reputation: 8572
For your specific case, the problem is that you are not quoting Male/Female and Weighted Average Income. Names that contain spaces or operators are not syntactic in R, so they have to be wrapped in backticks. Also, your data and basic aesthetics should likely be passed to ggplot() and not geom_line(). Putting them in geom_line() restricts them to that single layer, so you would have to repeat them in every other layer you add to the plot, for example geom_smooth().
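As a quick illustration of why the quoting matters, here is a minimal base R sketch. The data frame df and its values are made up for the example and are not the asker's data.

```r
# Made-up two-row data frame; check.names = FALSE keeps the names exactly as given
df <- data.frame(
  `Male/Female` = c("Male", "Female"),
  `Weighted Average Income` = c(30, 25),
  check.names = FALSE
)

# Backticks make R treat the whole name as a single symbol
income <- df$`Weighted Average Income`

# Without backticks, Male/Female parses as "Male divided by Female"
parsed <- quote(Male/Female)
is_division <- identical(parsed[[1]], as.name("/"))

# make.names() shows what a syntactic version of each name would look like
syntactic <- make.names(names(df))
```

Renaming the columns to syntactic names up front (for example with make.names()) is an alternative to backticks if you prefer not to quote them in every call.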
So to fix your problem you could do
library(data.table) # needed for the dt[Country == 'Germany'] subset
library(ggplot2)
plot <- ggplot(data = dt[Country == 'Germany'],
               aes(x = Birthyear,
                   y = `Weighted Average Income`,
                   col = `Weighted Average Income`)) + # !!sym("Weighted Average Income") would also work inside aes()
  geom_line() +
  facet_grid(Country ~ `Male/Female`) # backticks are required for non-syntactic names
plot
Now ggplot2 actually has a (lesser known) built-in way to change the data of an existing plot, so if you wanted to compare this to the plot with all of your countries included, you could do:
plot %+% dt # `%+%` is used to change the data used by one or more layers. See help("+.gg")
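To try this without the original data, here is a minimal sketch on a made-up data.table. The values and the names p_germany and p_all are invented for illustration; only the column names follow the question.

```r
library(data.table)
library(ggplot2)

# Made-up data: two countries, two sexes, three birth years each
dt <- data.table(
  Country = rep(c("Germany", "France"), each = 6),
  `Male/Female` = rep(rep(c("Male", "Female"), each = 3), times = 2),
  Birthyear = rep(c(1960, 1970, 1980), times = 4),
  `Weighted Average Income` = c(30, 34, 38, 25, 29, 33, 28, 31, 35, 24, 27, 30)
)

# Single-country plot: the subset happens in data.table's i-expression
p_germany <- ggplot(dt[Country == "Germany"],
                    aes(x = Birthyear, y = `Weighted Average Income`)) +
  geom_line() +
  facet_grid(Country ~ `Male/Female`)

# Same plot specification with the full data swapped in
p_all <- p_germany %+% dt
```

Printing p_germany should give one row of panels (Germany only), while p_all reuses the same layers and facets for every country in dt.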
Upvotes: 1