Guilherme

Reputation: 332

Sum only values from group inside ggplot function

Is there a way in ggplot to sum the values only from the group I'm choosing?

I have this data.frame:

      table age       n
1      base   0     122
2      base   1   20043
3      base   2 1057146
4      base   3 1787504
5      base   4 2211046
6  sample_1   1      50
7  sample_1   2    2478
8  sample_1   3    4186
9  sample_1   4    5161
10 sample_2   1      41
11 sample_2   2    2492
12 sample_2   3    4351
13 sample_2   4    5288
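To make the example reproducible, here is a minimal sketch that rebuilds the data frame from the printout above. The values are copied from the printout. The column types (character for table, numeric for age and n) are assumptions, since the printout does not show them.

```r
# Rebuild the data frame shown above; column types are assumed, not confirmed.
df <- data.frame(
  table = c(rep("base", 5), rep("sample_1", 4), rep("sample_2", 4)),
  age   = c(0:4, 1:4, 1:4),
  n     = c(122, 20043, 1057146, 1787504, 2211046,
            50, 2478, 4186, 5161,
            41, 2492, 4351, 5288)
)

# Per-table totals, i.e. the denominators wanted for the proportions
totals <- tapply(df$n, df$table, sum)
```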

I want to plot the proportion of each age within each table (the table column). I tried dividing n by sum(n) inside the ggplot call, but sum(n) is then taken over all tables rather than per table:

df %>%
  ggplot(aes(x = age, y = n/sum(n), group = table)) +
  geom_bar(stat = 'identity') +
  facet_grid(. ~ table)

The workaround I found was to compute the proportions before plotting:

df %>%
  group_by(table) %>%
  mutate(prop = n/sum(n)) %>%
  ggplot(aes(x = age, y = prop, group = table)) +
  geom_bar(stat = 'identity') +
  facet_grid(. ~ table)

Is it possible to do it directly in the ggplot function?

Upvotes: 0

Views: 451

Answers (1)

G. Grothendieck

Reputation: 269371

Compute the proportions within each table directly in the y aesthetic, using ave with prop.table, and optionally format the y-axis labels as percentages.

library(ggplot2)
library(scales)

ggplot(df, aes(x = age, y = ave(n, table, FUN = prop.table), group = table)) +
  geom_bar(stat = "identity") +
  scale_y_continuous(labels = percent) +
  facet_grid(. ~ table) +
  ylab("percent")
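As a quick check on what the ave call does, the sketch below compares it with a manual per-group division on a small subset of the question's data (only the sample_1 and sample_2 rows, rebuilt here so the snippet runs on its own). The names prop_ave, group_totals and prop_manual are illustrative only.

```r
# Subset of the question's data, rebuilt so this snippet is self-contained
df <- data.frame(
  table = rep(c("sample_1", "sample_2"), each = 4),
  age   = rep(1:4, times = 2),
  n     = c(50, 2478, 4186, 5161, 41, 2492, 4351, 5288)
)

# ave() applies FUN within each level of the grouping variable and returns
# a vector the same length as the input; prop.table(x) on a vector is x / sum(x)
prop_ave <- ave(df$n, df$table, FUN = prop.table)

# The same result computed by hand: divide each n by its own table's total
group_totals <- tapply(df$n, df$table, sum)
prop_manual  <- df$n / group_totals[as.character(df$table)]

# Both approaches should agree, and proportions should sum to 1 per table
all.equal(prop_ave, unname(prop_manual))
tapply(prop_ave, df$table, sum)
```

Because ave returns one value per row in the original row order, it can be used inside aes() without reshaping the data first.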

Upvotes: 1
