Lutz

Reputation: 223

ggplot2 put labels on a stacked bar chart

I want to label each bar with the output of my summary function (the sum of length per month), placed on top of the bar.

I tried storing the desired numbers in a vector and using it as the label, but that did not work.

Here is my example code:

library(ggplot2)

month<-c(1,3,2,4,3,10,12,4,9,5,6,6,7,9,9,8,10,9,11,12,9)
length<-c(2,3.5,4,10,14,16,20,34,10.5,2,10.4,3.4,4,5,6,12,5,34,5.6,56.5,22)
year<-c(2019,2018,2018,2017,2018,2016,2016,2017,2018,2019,2016,2017,2017,2018,2019,2016,2017,2018,2019,2016,2019)

df<-data.frame(month,length,year)


ggplot(df) +
  geom_bar(aes(month, length, fill = as.factor(year)), 
           position = "stack", stat = "summary", fun.y = "sum")+
  scale_x_continuous(breaks = seq(1,12,by = 1))

Is there any way to use the output of fun.y = "sum" directly as the geom_text() label?

Upvotes: 3

Views: 2187

Answers (2)

eastclintw00d

Reputation: 2364

I don't know whether the summary result can be used directly in geom_text(), but here is another solution to your problem:

library(ggplot2)
library(dplyr)

month<-c(1,3,2,4,3,10,12,4,9,5,6,6,7,9,9,8,10,9,11,12,9)
length<-c(2,3.5,4,10,14,16,20,34,10.5,2,10.4,3.4,4,5,6,12,5,34,5.6,56.5,22)
year<-c(2019,2018,2018,2017,2018,2016,2016,2017,2018,2019,2016,2017,2017,2018,2019,2016,2017,2018,2019,2016,2019)

df<-data.frame(
  year = as.factor(year),
  month = as.factor(month),
  length
)

df %>% 
  group_by(year, month) %>% 
  summarise(length = sum(length)) %>% 
  arrange(month, desc(year)) %>%
  group_by(month) %>%
  mutate(label_pos = cumsum(length) - .5 * length) %>% # label position: midpoint of each stacked segment
  ggplot(aes(month, length)) +
  geom_bar(aes(fill = year), position = "stack", stat = "identity") +
  geom_text(aes(label = length, y = label_pos))
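
As a possible shortcut for the manual offset above, ggplot2's position_stack(vjust = 0.5) can centre each label inside its stacked segment. This is a minimal sketch rather than the answer's own method: it assumes the same data as the question, the names `totals` and `p_stack` are made up for the example, and `.groups = "drop"` assumes dplyr 1.0.0 or later.

```r
library(ggplot2)
library(dplyr)

# Same data as in the question
month  <- c(1,3,2,4,3,10,12,4,9,5,6,6,7,9,9,8,10,9,11,12,9)
length <- c(2,3.5,4,10,14,16,20,34,10.5,2,10.4,3.4,4,5,6,12,5,34,5.6,56.5,22)
year   <- c(2019,2018,2018,2017,2018,2016,2016,2017,2018,2019,2016,2017,2017,2018,2019,2016,2017,2018,2019,2016,2019)

# One row per year/month combination (hypothetical helper data frame)
totals <- data.frame(year = as.factor(year), month = as.factor(month), length) %>%
  group_by(year, month) %>%
  summarise(length = sum(length), .groups = "drop")

# group = year makes the text stack in the same order as the bars;
# vjust = 0.5 places each label at the middle of its segment
p_stack <- ggplot(totals, aes(month, length)) +
  geom_col(aes(fill = year)) +
  geom_text(aes(label = length, group = year),
            position = position_stack(vjust = 0.5))

p_stack
```

With this variant there is no need to sort the rows or compute cumulative sums by hand, since the text layer reuses the stacking logic of the bars.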



If you want percentages that sum to 100% per month, you can use the scales package:

df %>% 
  group_by(year, month) %>% 
  summarise(length = sum(length)) %>% 
  group_by(month) %>% 
  mutate(perc = scales::percent(round(length / sum(length), 3))) %>% 
  arrange(month, desc(year)) %>%
  mutate(label_pos = cumsum(length) - .5 * length) %>% # data is still grouped by month; label position is the midpoint of each stacked segment
  ggplot(aes(month, length)) +
  geom_bar(aes(fill = year), position = "stack", stat = "identity") +
  geom_text(aes(label = perc, y = label_pos))


Upvotes: 3

liborm

Reputation: 2724

As per the docs:

... If you want the heights of the bars to represent values in the data, use geom_col() instead. ...

So your results can be reproduced with much cleaner code (I also took the liberty of converting the apparent factors):

library(tidyverse)

month <- c(1,3,2,4,3,10,12,4,9,5,6,6,7,9,9,8,10,9,11,12,9)
length <- c(2,3.5,4,10,14,16,20,34,10.5,2,10.4,3.4,4,5,6,12,5,34,5.6,56.5,22)
year <- c(2019,2018,2018,2017,2018,2016,2016,2017,2018,2019,2016,2017,2017,2018,2019,2016,2017,2018,2019,2016,2019)

data.frame(month,length,year) %>% 
  mutate(
    month = as.factor(month),
    year = as.factor(year)) ->
  df

df %>% 
  ggplot() +
  geom_col(aes(month, length, fill = year))

Working with stat = is often a pain in ggplot2, so it's easier to pre-compute the statistics with the awesome dplyr verbs:

df %>% 
  group_by(month) %>% 
  mutate(monthly = sum(length)) %>% 
  ggplot() +
  geom_col(aes(month, length, fill = year)) +
  geom_text(aes(month, monthly, label = monthly),
            vjust = -1) +
  ylim(0, 90)

The quirk of this approach is that mutate() keeps one row per observation, so a month with several rows gets the same label printed multiple times on top of each other. You can pass a separate summarised dataset to geom_text() to get rid of this:

df %>% 
  ggplot() +
  geom_col(aes(month, length, fill = year)) +
  geom_text(aes(month, monthly, label = monthly),
            vjust = -1,
            data = . %>% group_by(month) %>% summarise(monthly = sum(length))) +
  ylim(0, 90)

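I used . in place of the data frame reference: `. %>% ...` builds a function, and ggplot2 applies a function passed as data to the plot's default data. That way you need to replace only one instance of df if you want to use a different dataset.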
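
For comparison with the pre-computed totals, the sum can also come from the stat itself, which is closer to what the question asked. The following is only a sketch under assumptions: it needs ggplot2 3.3.0 or later (where `fun` replaced `fun.y` and `after_stat()` was introduced), and `df_q` and `p_stat` are names invented for the example.

```r
library(ggplot2)

# Same data as in the question
month  <- c(1,3,2,4,3,10,12,4,9,5,6,6,7,9,9,8,10,9,11,12,9)
length <- c(2,3.5,4,10,14,16,20,34,10.5,2,10.4,3.4,4,5,6,12,5,34,5.6,56.5,22)
year   <- c(2019,2018,2018,2017,2018,2016,2016,2017,2018,2019,2016,2017,2017,2018,2019,2016,2017,2018,2019,2016,2019)

df_q <- data.frame(month = as.factor(month), length, year = as.factor(year))

p_stat <- ggplot(df_q, aes(month, length)) +
  geom_col(aes(fill = year)) +
  # The summary stat sums length at each month; after_stat(y) exposes
  # that computed sum so it can be mapped to the label aesthetic
  geom_text(aes(label = after_stat(y)),
            stat = "summary", fun = sum, vjust = -0.5) +
  ylim(0, 90)

p_stat
```

Because the text layer has no fill mapping, the stat summarises once per month, so each bar gets a single label with the monthly total.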


Upvotes: 4
