Reputation: 853
I am trying to plot observations and their grouped regression lines with ggplot as follows:
ggplot(df, aes(x = cabpol.e, y = pred.vote_share, color = coalshare)) +
  geom_point() +
  scale_color_gradient2(midpoint = 50, low = "blue", mid = "green", high = "red") +
  geom_smooth(aes(x = cabpol.e, y = pred.vote_share, group = coalshare1, fill = coalshare1),
              se = FALSE, method = "lm") +
  scale_fill_manual(values = c(Junior = "blue", Medium = "green", Senior = "red"))
The problem is that the lines from geom_smooth are all the same color. I tried using scale_fill_manual to avoid having two different color scales and to set manually which color corresponds to each group, but all the lines appear blue instead. How can I make each line a different color?
As requested, here is a reproducible example with the same problem:
set.seed(1000)
dff <- data.frame(x = rnorm(100, 0, 1),
                  y = rnorm(100, 1, 2),
                  z = seq(1, 100, 1),
                  g = rep(c("A", "B"), 50))
ggplot(dff, aes(x = x, y = y, color = z, group = g, fill = g)) +
  geom_point() +
  scale_color_gradient2(midpoint = 50, low = "blue", high = "red") +
  geom_smooth(se = FALSE, method = "lm")
Upvotes: 2
Views: 3421
Reputation: 853
Group trends between the x and y variables can be plotted by passing different data frames to geom_line (predicted values) and geom_point (raw data). In the ggplot() call, map color to a single variable so that both layers share one color scale, and in geom_line group by that same variable.
p2 <- ggplot(NULL, aes(x = cabpol.e, y = vote_share, color = coalshare)) +
  geom_line(data = preds, aes(group = coalshare, color = coalshare), size = 1) +
  geom_point(data = df, aes(x = cabpol.e, y = vote_share)) +
  scale_color_gradient2(name = "Share of Seats\nin Coalition (%)",
                        midpoint = 50, low = "blue", mid = "green", high = "red") +
  xlab("Ideological Differences on State/Market") +
  ylab("Vote Share (%)") +
  ggtitle("Vote Share Won by Coalition Parties in Next Election")
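The code above does not show how preds is built. One possible way, sketched here on the reproducible dff from the question, is to fit a model and predict over a grid. This is only an illustration under assumptions: the linear model with an x-by-z interaction, the object names fit and preds, and the three representative z values (10, 50, 90) are all hypothetical choices, not taken from the original analysis.

```r
library(ggplot2)

set.seed(1000)
dff <- data.frame(x = rnorm(100, 0, 1),
                  y = rnorm(100, 1, 2),
                  z = seq(1, 100, 1))

# Assumed model: y depends on x, z, and their interaction
fit <- lm(y ~ x * z, data = dff)

# Prediction grid: the observed range of x at three arbitrary z values
preds <- expand.grid(x = seq(min(dff$x), max(dff$x), length.out = 50),
                     z = c(10, 50, 90))
preds$y <- predict(fit, newdata = preds)

# Points come from the raw data, lines from the predictions;
# both layers share the continuous color scale for z
p <- ggplot(NULL, aes(x = x, y = y, color = z)) +
  geom_point(data = dff) +
  geom_line(data = preds, aes(group = z)) +
  scale_color_gradient2(midpoint = 50, low = "blue", mid = "green", high = "red")
```

Because each predicted line has a single z value, its color is constant along the line and is read from the same gradient as the points.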
Upvotes: 1
Reputation: 3060
My solution to this problem would be to add one geom_smooth call per factor level, each time subsetting the data to that level. This way you can pass a different fixed color to each geom_smooth call. As long as the factor does not have many levels, this solution is not terribly inefficient.
library(ggplot2)
library(dplyr)  # for filter(); without it, stats::filter() is called and fails

dff <- data.frame(x = rnorm(100, 0, 1),
                  y = rnorm(100, 1, 2),
                  z = seq(1, 100, 1),
                  g = rep(c("A", "B"), 50))
ggplot(dff, aes(x = x, y = y,
                color = z,
                group = g)) +
  geom_point() +
  scale_color_gradient2(midpoint = 50, low = "blue", high = "red") +
  # regression line for group A only, with a fixed color
  geom_smooth(
    aes(x = x, y = y),
    color = "red",
    method = "lm",
    data = filter(dff, g == "A"),
    se = FALSE
  ) +
  # regression line for group B only, with a fixed color
  geom_smooth(
    aes(x = x, y = y),
    color = "blue",
    method = "lm",
    data = filter(dff, g == "B"),
    se = FALSE
  )
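An alternative sketch that avoids one geom_smooth call per level, assuming filled point shapes are acceptable: the line color of geom_smooth is controlled by the color aesthetic, not fill, so the continuous variable can be moved to fill (using point shape 21) and color left free for the discrete group. The specific colors and the point size below are arbitrary choices for illustration.

```r
library(ggplot2)

set.seed(1000)
dff <- data.frame(x = rnorm(100, 0, 1),
                  y = rnorm(100, 1, 2),
                  z = seq(1, 100, 1),
                  g = rep(c("A", "B"), 50))

p <- ggplot(dff, aes(x = x, y = y)) +
  # shape 21 is a fillable circle, so the continuous variable can use `fill`
  geom_point(aes(fill = z), shape = 21, size = 2) +
  scale_fill_gradient2(midpoint = 50, low = "blue", high = "red") +
  # line color is mapped to the discrete group on its own scale
  geom_smooth(aes(color = g), method = "lm", se = FALSE) +
  scale_color_manual(values = c(A = "red", B = "blue"))
```

This keeps two separate scales (fill for the points, color for the lines), so each gets its own legend.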
Upvotes: 2