Tony
Tony

Reputation: 13

Adding summary row to dplyr output

I found a few solutions on here but none seem to work to add a summary row to dplyr output.

 #mock up data
 df <- data.frame("Market" = sample(c("East", "North", "West"), 100, replace = TRUE, prob = c(0.33, 0.33, 0.34)),
             "var1" = sample(c("Y", "N"), 100, replace = TRUE, prob = c(0.4, 0.6)),
             "var2" = sample(c("Y", "N"), 100, replace = TRUE, prob = c(0.7, 0.3)),
             "var3" = sample(c("Y", "N"), 100, replace = TRUE, prob = c(0.5, 0.5)))

Here is the code:

 df_report <- df %>%
   group_by(Market) %>%
   filter(Market == "East" | Market == "West") %>%
   summarise(n = n(),
        var1_y = sum(var1 == "Y"),
        var1_n = sum(var1 == "N")) %>%
   mutate(total = var1_y + var1_n,
     var1_y_pct = (var1_y/total),
     var1_n_pct = (var1_n/total),
     pct_total = total/sum(total))

Here is the output:

 # A tibble: 2 x 8
   Market     n var1_y var1_n total var1_y_pct var1_n_pct pct_total
   <fct>  <int>  <int>  <int> <int>      <dbl>      <dbl>     <dbl>
 1 East      29     13     16    29      0.448      0.552     0.453
 2 West      35     16     19    35      0.457      0.543     0.547

Here are the two solutions I tried:

Option 1

 df_report %>%
    add_row(Market = "Total", n = sum(n), var1_y = sum(var1_y), var1_n = sum(var1_n), 
 total = sum(total), var1_y_pct = sum(var1_y_pct), var1_n_pct = sum(varn_y_pct), pct_total = sum(pct_total))

Option 2

 df_report %>%
   rbind(c("Total", sum(n), sum(var1_y), sum(var1_n), sum(total), sum(var1_y_pct), sum(varn_y_pct), sum(pct_total)))

Both give me the same error: Error in sum(n) : invalid 'type' (closure) of argument

I'm unable to determine why these solutions, while working for others and seeming very reasonable, are not working for me.

Upvotes: 1

Views: 1123

Answers (1)

Matias Andina
Matias Andina

Reputation: 4230

You should try

df_report %>% janitor::adorn_totals("row")

Which produces

 Market  n var1_y var1_n total var1_y_pct var1_n_pct pct_total
   East 30     11     19    30  0.3666667  0.6333333 0.4285714
   West 40     19     21    40  0.4750000  0.5250000 0.5714286
  Total 70     30     40    70  0.8416667  1.1583333 1.0000000

The long way of doing this is going for summarise (watch out, you have a typo in var1_n_pct). Then bind the rows.

row_to_add <- df_report %>%
  summarise(Market = "Total",
          n = sum(n),
          var1_y = sum(var1_y), 
          var1_n = sum(var1_n), 
          total = sum(total),
          var1_y_pct = sum(var1_y_pct),
          var1_n_pct = sum(var1_n_pct), 
          pct_total = sum(pct_total))

df_report %>% bind_rows(row_to_add)

Upvotes: 0

Related Questions