user338714

Reputation: 2375

How can a line be overlaid on a bar plot using ggplot2?

I'm looking for a way to plot a bar chart containing two series, hide the bars for one of them, and instead draw a line (smooth, if possible) through the tops of where the hidden bars would have been, similar to overlaying a frequency polygon on a histogram. I've tried the example below but am running into two problems.

First, I need to summarize (total) the data by group, and second, I'd like to convert one of the series (df2) to a line.

df <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,1,2,2,3,3))  
df2 <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,4,3,5,1,2))  
ggplot(df, aes(x=grp, y=val)) +   
    geom_bar(stat="identity", alpha=0.75) +  
    geom_bar(data=df2, aes(x=grp, y=val), stat="identity", position="dodge")

Upvotes: 15

Views: 84571

Answers (2)

Richie Cotton

Reputation: 121077

You can get group totals in many ways. One of them is

with(df, tapply(val, grp, sum))
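As a rough illustration of two of those ways, the sketch below computes the totals for the question's df with tapply() and with aggregate(). The names totals and totals_df are made up for this example.

```r
# Sample data from the question
df <- data.frame(grp = c("A", "A", "B", "B", "C", "C"),
                 val = c(1, 1, 2, 2, 3, 3))

# tapply() returns a named array with one total per group
totals <- with(df, tapply(val, grp, sum))

# aggregate() returns the same totals as a data frame,
# which is often more convenient to hand to ggplot()
totals_df <- aggregate(val ~ grp, data = df, FUN = sum)
```

Either form works for plotting; the data frame version avoids a separate step to attach group labels.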

For simplicity, you can combine bar and line data into a single dataset.

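# factor() first, so this also works when grp is a character column (the data.frame() default since R 4.0)
df_all <- data.frame(grp = factor(levels(factor(df$grp))))
df_all$bar_heights <- as.vector(with(df, tapply(val, grp, sum)))
df_all$line_y <- as.vector(with(df2, tapply(val, grp, sum)))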

Bar charts use a categorical x-axis. To overlay a line, convert the x values to numeric so that geom_line() connects the points across groups.

ggplot(df_all) +
   geom_bar(aes(x = grp, weight = bar_heights)) +
   geom_line(aes(x = as.numeric(grp), y = line_y))
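A possible alternative to the numeric conversion, sketched here under the assumption that ggplot2 2.2.0 or later is installed (for geom_col()), is to keep the categorical axis and set group = 1 so that all points are treated as one series. The names bar_df and line_df are invented for this example.

```r
library(ggplot2)

# Sample data from the question
df  <- data.frame(grp = c("A", "A", "B", "B", "C", "C"), val = c(1, 1, 2, 2, 3, 3))
df2 <- data.frame(grp = c("A", "A", "B", "B", "C", "C"), val = c(1, 4, 3, 5, 1, 2))

# One total per group for each series
bar_df  <- aggregate(val ~ grp, data = df,  FUN = sum)
line_df <- aggregate(val ~ grp, data = df2, FUN = sum)

# group = 1 puts every point in one group, so geom_line()
# connects them even though x is categorical
p <- ggplot(bar_df, aes(x = grp, y = val)) +
  geom_col(alpha = 0.75) +
  geom_line(data = line_df, aes(group = 1), colour = "blue")

print(p)
```

This keeps the group labels on the axis without any as.numeric() step, at the cost of needing both series in the same x and y columns.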


Upvotes: 22

Chase

Reputation: 69171

Perhaps your sample data aren't representative of the real data you are working with, but there are no lines to be drawn for df2. There is only one value for each x and y value. Here's a modified version of your df2 with enough data points to construct lines:

df <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,2,3,1,2,3))
df2 <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,4,3,5,0,2))

p <- ggplot(df, aes(x=grp, y=val)) 
p <- p + geom_bar(stat="identity", alpha=0.75) 

p + geom_line(data=df2, aes(x=grp, y=val), colour="blue")

Alternatively, if your example data above is correct, you can plot this information as a point with geom_point(data = df2, aes(x = grp, y = val), colour = "red", size = 6). You can obviously change the color and size to your liking.

EDIT: In response to comment

I'm not entirely sure what a frequency polygon over a histogram is supposed to look like here. Are the x-values supposed to be connected to one another? Also, you keep referring to lines, but your code uses geom_bar(), which I assume isn't what you want. If you want lines, use geom_line(). If both assumptions are correct, here's an approach:

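# First, summarise df2 by group (ddply() comes from the plyr package).
# The totals shown match the df2 from the question.
library(plyr)
df3 <- ddply(df2, .(grp), summarise, total = sum(val))
df3
#   grp total
# 1   A     5
# 2   B     8
# 3   C     3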

# Second, plot df3 as a line while treating the grp variable as numeric.
# factor() first, so this also works when grp is a character column (the default since R 4.0).

p <- ggplot(df, aes(x=grp, y=val))
p <- p + geom_bar(alpha=0.75, stat = "identity")
p + geom_line(data=df3, aes(x=as.numeric(factor(grp)), y=total), colour = "red")
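For the "smooth if possible" part of the question, one option is to interpolate the group totals with base R's spline() and draw the result with geom_line(). This is only a sketch: it uses the question's sample data, base aggregate() in place of ddply(), and the made-up names bar_df, line_df and smooth_df. It also assumes the bars sit at x = 1, 2, 3 on the discrete axis.

```r
library(ggplot2)

# Sample data from the question
df  <- data.frame(grp = c("A", "A", "B", "B", "C", "C"), val = c(1, 1, 2, 2, 3, 3))
df2 <- data.frame(grp = c("A", "A", "B", "B", "C", "C"), val = c(1, 4, 3, 5, 1, 2))

# One total per group for each series
bar_df  <- aggregate(val ~ grp, data = df,  FUN = sum)
line_df <- aggregate(val ~ grp, data = df2, FUN = sum)

# Interpolate the line totals at 100 evenly spaced x positions
# between the first and last bar; the curve passes through each total
smooth_df <- as.data.frame(
  spline(x = seq_len(nrow(line_df)), y = line_df$val, n = 100)
)

# Bars first, so the x scale stays categorical; the numeric x values
# of the curve are then placed between the bar positions
p <- ggplot(bar_df, aes(x = grp, y = val)) +
  geom_col(alpha = 0.75) +
  geom_line(data = smooth_df, aes(x = x, y = y), colour = "red")

print(p)
```

With only three groups the curve is determined by very few points, so it can overshoot the totals between bars; whether that is acceptable depends on what the line is meant to convey.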

Upvotes: 14
