HumanityFirst

Reputation: 315

How to use labeller() functions to show column totals in the facet labels when using facet_grid() in ggplot2

Here's a data set to give context to my question:

library(tidyr); library(dplyr); library(ggplot2)
set.seed(1)
dfr2 <- tibble(x1 = factor(sample(letters[1:3], 50, replace = TRUE), levels = letters[1:3]),
               x2 = factor(sample(letters[1:2], 50, replace = TRUE), levels = letters[1:2]),
               x3 = factor(sample(letters[1:3], 50, replace = TRUE), levels = letters[1:3]),
               grpA = factor(sample(c("grp1", "grp2"), 50, prob = c(0.3, 0.7), replace = TRUE), levels = c("grp1", "grp2")),
               grpB = factor(sample(c("grp1", "grp2"), 50, prob = c(0.6, 0.4), replace = TRUE), levels = c("grp1", "grp2"))
               )

head(dfr2)

Here's a function that prepares the data for plotting:


plot_data_prepr <- function(dat, groupvar, mainvar){
  
  groupvar <- sym(groupvar)
  mainvar <- sym(mainvar)
  
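  plot_data <- dat %>% 
    group_by(!!groupvar) %>% 
    # count every level of mainvar within each group, keeping empty levels
    count(!!mainvar, .drop = FALSE) %>% drop_na() %>% 
    mutate(pct = n/sum(n),
           # give zero-count levels a small height so the bar stays visible
           pct2 = ifelse(n == 0, 0.005, n/sum(n)),
           # total number of observations in the group
           grp_tot = sum(n),
           pct_lab = paste0(format(pct*100, digits = 1), '%'),
           pct_pos = pct2 + .02)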
  
  return(plot_data)
}

Here's normal usage of the function:


plot_data_prepr(dat = dfr2, groupvar = "grpA", mainvar = "x1")

My goal is to use a labeller function with facet_grid() so that the 'grp_tot' variable calculated inside plot_data_prepr() is pasted onto the matching facet label. The two facet labels would then read 'grp1 (N = 20)' and 'grp2 (N = 30)'.

I can successfully append a string to the factor level:


plusN <- function(string) {
  label <- paste0(string, ' (N = ',')')
  label
}

ggplot(plot_data_prepr(dfr2, "grpA", "x1"),
                 aes(x = x1, y = pct2, fill = x1)) +
      geom_bar(stat = 'identity') +
      ylim(0,1) +
      geom_text(aes(label=pct_lab, y = pct_pos + .02)) +
      facet_grid(. ~ grpA, labeller = labeller(grpA = plusN)) 

But when I try to paste the value of the 'grp_tot' variable into the plusN function, it can't find the variable. I think I need to somehow delay the evaluation of 'grp_tot' in the plusN function until it is called inside facet_grid(), but I'm not sure how to do that:


plusN <- function(string) {
  label <- paste0(string, ' (N = ',eval.parent(grp_tot),')')
  label
}

ggplot(plot_data_prepr(dfr2, "grpA", "x1"),
                 aes(x = x1, y = pct2, fill = x1)) +
      geom_bar(stat = 'identity') +
      ylim(0,1) +
      geom_text(aes(label=pct_lab, y = pct_pos + .02)) +
      facet_grid(. ~ grpA, labeller = labeller(grpA = plusN)) 

I hope someone might be able to help me.

Thanks.

Upvotes: 1

Views: 241

Answers (2)

YBS

Reputation: 21297

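With a minimal modification to your last ggplot example, the following code

dd <- plot_data_prepr(dat = dfr2, groupvar = "grpA", mainvar = "x1")

# named vector: group level -> group total, so totals are matched by name
# (matching by position would break if two groups had the same total)
lookup <- setNames(dd$grp_tot, as.character(dd$grpA))

plusN <- function(string) {
  # lookup[string] picks the first total stored under each group name
  label <- paste0(string, ' (N = ', lookup[string], ')')
  label
}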

ggplot(plot_data_prepr(dfr2, "grpA", "x1"),
       aes(x = x1, y = pct2, fill = x1)) +
  geom_bar(stat = 'identity') +
  ylim(0,1) +
  geom_text(aes(label=pct_lab, y = pct_pos + .02)) +
  facet_grid(. ~ grpA, labeller = labeller(grpA = plusN)) 

gives this output:

[image: plot output]

Please note that this works regardless of the number of groups within grpA, as long as each facet label is matched to the total of its own group.
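As for why the original attempt fails: a function passed through labeller() receives only a character vector of facet values, not the plotted data, so a column such as grp_tot is never visible from inside it. One way to keep the lookup out of the global environment is a small function factory. This is only a sketch on toy data; make_plusN, toy and grp are made-up names for illustration, not part of ggplot2 or of the question's code.

```r
library(ggplot2)

# Hypothetical helper: returns a labeller function that closes over
# a named vector of totals (names = group levels).
make_plusN <- function(groups, totals) {
  lookup <- setNames(totals, as.character(groups))
  function(string) {
    # 'string' is the character vector of facet values
    paste0(string, " (N = ", lookup[string], ")")
  }
}

# Toy data (assumed for illustration) with group sizes 2 and 3
toy <- data.frame(
  grp = factor(c("g1", "g1", "g2", "g2", "g2")),
  x   = c("a", "b", "a", "a", "b")
)
tot <- table(toy$grp)

plusN <- make_plusN(names(tot), as.vector(tot))

p <- ggplot(toy, aes(x = x)) +
  geom_bar() +
  facet_grid(. ~ grp, labeller = labeller(grp = plusN))
```

With the question's data, the same factory could presumably be called as make_plusN(dd$grpA, dd$grp_tot), since repeated names resolve to the first matching total.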

Upvotes: 2

Count Orlok

Reputation: 1007

I think the cleanest approach for a situation like yours would be to use a lookup table for your labeller instead of a function:

lookup <- c(
  grp1 = "grp1 (N = 20)",
  grp2 = "grp2 (N = 30)"
)

ggplot(plot_data_prepr(dfr2, "grpA", "x1"), aes(x = x1, y = pct2, fill = x1)) +
  geom_bar(stat = 'identity') +
  ylim(0,1) +
  geom_text(aes(label=pct_lab, y = pct_pos + .02)) +
  facet_grid(. ~ grpA, labeller = labeller(grpA = lookup))

If you think your group totals might change in the future, you can also auto-generate the labels by processing the data beforehand and extracting the necessary parts:

data <- plot_data_prepr(dfr2, "grpA", "x1")

lookup <- c(
    grp1 = paste0("grp1 (N = ", data$grp_tot[data$grpA == "grp1"][1], ")"),
    grp2 = paste0("grp2 (N = ", data$grp_tot[data$grpA == "grp2"][1], ")")
)

ggplot(data, aes(x = x1, y = pct2, fill = x1)) + 
  geom_bar(stat = 'identity') +
  ylim(0,1) +
  geom_text(aes(label=pct_lab, y = pct_pos + .02)) +
  facet_grid(. ~ grpA, labeller = labeller(grpA = lookup)) 
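The hand-written lookup above needs one line per level. If the number of levels may grow, the named vector could be built in one step instead. A minimal sketch on a small toy data frame follows; toy, totals and grp are hypothetical names chosen for illustration, and the same pattern should carry over to grpA and grp_tot in the question's data.

```r
library(dplyr)
library(ggplot2)

# Toy data (assumed for illustration): three groups of size 2, 3 and 1
toy <- tibble(
  grp = factor(c("g1", "g1", "g2", "g2", "g2", "g3")),
  x   = factor(c("a", "b", "a", "a", "b", "a"))
)

# One row per group together with its total
totals <- count(toy, grp, name = "grp_tot")

# Named character vector: names are the facet values, values are the labels
lookup <- setNames(
  paste0(as.character(totals$grp), " (N = ", totals$grp_tot, ")"),
  as.character(totals$grp)
)

p <- ggplot(toy, aes(x = x, fill = x)) +
  geom_bar() +
  facet_grid(. ~ grp, labeller = labeller(grp = lookup))
```

Because the labels are keyed by level name, adding or removing a level in the data changes the lookup without any edits to the plotting code.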

Upvotes: 2
