Reputation: 2375
I'm looking for a way to plot a bar chart containing two different series, hide the bars for one of the series, and instead draw a line (smooth if possible) through the tops of where the hidden bars would have been, similar to overlaying a frequency polygon on a histogram. I've tried the example below but appear to be running into two problems.
First, I need to summarize (total) the data by group, and second, I'd like to convert one of the series (df2) to a line.
df <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,1,2,2,3,3))
df2 <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,4,3,5,1,2))
library(ggplot2)
ggplot(df, aes(x=grp, y=val)) +
  geom_bar(stat="identity", alpha=0.75) +
  geom_bar(data=df2, aes(x=grp, y=val), stat="identity", position="dodge")
Upvotes: 15
Views: 84571
Reputation: 121077
You can get group totals in many ways. One of them is
with(df, tapply(val, grp, sum))
For simplicity, you can combine bar and line data into a single dataset.
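# Build the groups from the unique values so this works whether grp is a factor or a character vector
df_all <- data.frame(grp = factor(sort(unique(df$grp))))
df_all$bar_heights <- with(df, tapply(val, grp, sum))
df_all$line_y <- with(df2, tapply(val, grp, sum))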
Bar charts use a categorical x-axis. To overlay a line, you will need to convert the grouping variable to numeric in the line layer.
ggplot(df_all) +
geom_bar(aes(x = grp, weight = bar_heights)) +
geom_line(aes(x = as.numeric(grp), y = line_y))
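As an alternative to converting the axis, here is a minimal sketch that keeps the x-axis categorical and sets `group = 1` in the line layer so the points are joined across groups. It assumes a ggplot2 version that provides `geom_col()`; the names `bar_totals` and `line_totals` are made up for this illustration.

```r
library(ggplot2)

df  <- data.frame(grp = c("A","A","B","B","C","C"), val = c(1,1,2,2,3,3))
df2 <- data.frame(grp = c("A","A","B","B","C","C"), val = c(1,4,3,5,1,2))

# Total each series by group with base R's aggregate()
bar_totals  <- aggregate(val ~ grp, data = df,  FUN = sum)
line_totals <- aggregate(val ~ grp, data = df2, FUN = sum)

p <- ggplot(bar_totals, aes(x = grp, y = val)) +
  geom_col(alpha = 0.75) +
  # group = 1 puts all rows in one group, so the line connects A, B and C
  geom_line(data = line_totals, aes(group = 1), colour = "red")
p
```

Without `group = 1`, each level of `grp` forms its own group containing a single point, so no line is drawn.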
Upvotes: 22
Reputation: 69171
Perhaps your sample data aren't representative of the real data you are working with, but there are no lines to be drawn for df2. There is only one value for each x and y value. Here's a modified version of your df2 with enough data points to construct lines:
df <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,2,3,1,2,3))
df2 <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,4,3,5,0,2))
p <- ggplot(df, aes(x=grp, y=val))
p <- p + geom_bar(stat="identity", alpha=0.75)
p + geom_line(data=df2, aes(x=grp, y=val), colour="blue")
Alternatively, if your example data above is correct, you can plot this information as a point with geom_point(data = df2, aes(x = grp, y = val), colour = "red", size = 6). You can obviously change the color and size to your liking.
EDIT: In response to comment
I'm not entirely sure what the visual for a frequency polygon over a histogram is supposed to look like. Are the x-values supposed to be connected to one another? Secondly, you keep referring to wanting lines, but your code shows geom_bar(), which I assume isn't what you want? If you want lines, use geom_line(). If the two assumptions above are correct, then here's an approach to do that:
#First let's summarise df2 by group (ddply() comes from the plyr package)
library(plyr)
df3 <- ddply(df2, .(grp), summarise, total = sum(val))
> df3
grp total
1 A 5
2 B 8
3 C 3
#Second, let's plot df3 as a line while treating the grp variable as numeric
p <- ggplot(df, aes(x=grp, y=val))
p <- p + geom_bar(alpha=0.75, stat = "identity")
p + geom_line(data=df3, aes(x=as.numeric(factor(grp)), y=total), colour = "red")
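The question also asked for a smooth line if possible. One way to approximate that, sketched here under the assumption that an interpolating spline through the group totals is acceptable, is to use base R's `spline()` on the numeric group positions and plot the result. The names `totals` and `smooth_df` are invented for this example, and the curve can overshoot the actual totals between groups.

```r
library(ggplot2)

df  <- data.frame(grp = c("A","A","B","B","C","C"), val = c(1,1,2,2,3,3))
df2 <- data.frame(grp = c("A","A","B","B","C","C"), val = c(1,4,3,5,1,2))

# Group totals for the series that should become a line
totals <- aggregate(val ~ grp, data = df2, FUN = sum)

# Interpolate 100 points through the totals at x positions 1, 2, 3
smooth_df <- as.data.frame(spline(x = seq_along(totals$grp), y = totals$val, n = 100))

p <- ggplot(df, aes(x = grp, y = val)) +
  geom_bar(stat = "identity", alpha = 0.75) +
  # The discrete x scale is set up by the bar layer; the numeric x values
  # of the spline are then placed at positions 1..3 along it
  geom_line(data = smooth_df, aes(x = x, y = y), colour = "red")
p
```

The bar layer is added first so that the categorical axis is established before the numeric positions are drawn on it.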
Upvotes: 14