R: Summation of Single Column Based on Conditions

Question

I have a dataset that has (with the help of this site) created a count based on 3 instances:

Number of Plans where Coins is greater than 20
Number of Plans where Coins is equal to 20
Number of Plans where Coins is less than 20

Here's a sample from the much larger dataset

Plan    Year   Coins   Copay   Enrollment
  A     2018     20      10       200
  B     2014     15       5       100
  C     2012     30       0        50
  D     2017     30      10       350
  E     2018     5       10       400
  F     2018     20       0       150
  G     2018     20       0       200
  H     2016     20      10       800
  I     2014     10       3       250
  J     2017     20       7       550

This was achieved through:

df %>%
   group_by(grp = case_when(Coins < 20 ~ 'grp1', Coins ==20 ~ 'grp2', TRUE ~ 'grp3')) %>%
   summarise(Plan = toString(unique(Plan)), prop = n()) %>%
   ungroup %>%
   select(-grp)

To get:

C, D - 2
A, F, G, H, J - 5
B, E, I - 3

I would now like to involve the Enrollment column and sum based on the three conditions above. Thus, I would like to achieve the following counts:

400 (50+350)
1900 (200+150+200+800+550)
750 (100+400+250)

Maurits Evers · Accepted Answer

You can add a counts = sum(Enrollment) summary prior to ungrouping.

library(tidyverse);
df %>%
    group_by(grp = case_when(Coins < 20 ~ 'grp1', Coins ==20 ~ 'grp2', TRUE ~ 'grp3')) %>%
    group_by(grp) %>%
    summarise(
        Plan = toString(unique(Plan)),
        prop = n(),
        counts = sum(Enrollment)) %>%
    ungroup() %>%
    select(-grp);
## A tibble: 3 x 4
#  Plan           prop counts
#             
#1 B, E, I           3    750
#2 A, F, G, H, J     5   1900
#3 C, D              2    400

Sample data

df <- read.table(text =
    "Plan    Year   Coins   Copay   Enrollment
  A     2018     20      10       200
  B     2014     15       5       100
  C     2012     30       0        50
  D     2017     30      10       350
  E     2018     5       10       400
  F     2018     20       0       150
  G     2018     20       0       200
  H     2016     20      10       800
  I     2014     10       3       250
  J     2017     20       7       550", header = T)

R: Summation of Single Column Based on Conditions

Answers (1)

Sample data

Related Questions