Reputation: 519
I am trying to summarise how many people in my data have had surgery and then gone on to die, so that I can calculate the mortality rate for surgery patients.
My data looks like this:
df <- data.frame(
y1988 = rep(c('Y', 'Y', 'Y', 'M', 'D', 'Y', 'Y', 'D', 'X', 'D'), 25),
y1989 = rep(c('Y', 'M', 'D', 'Y', 'X', 'Y', 'X', 'Y', 'Y', 'Y'), 25),
y1990 = rep(c('D', 'Y', 'D', 'X', 'Y', 'M', 'D', 'Y', 'Y', 'Y'), 25),
y1991 = rep(c('D', 'Y', 'Y', 'M', 'D', 'Y', 'Y', 'X', 'D', 'Y'), 25),
age = rep(20:69, 5),
ID = (1:250)
)
What I want to do, for each age and each year column (y1988 to y1991), is count the number of 'D' values and divide that by the number of 'Y' values.
If I were to do this manually, I would subset the data frame for each age and then divide the number of 'D' by the number of 'Y', e.g.
a21 <- filter(df, age == 21)
a21$mort1988 <- sum(a21$y1988 == 'D') / sum(a21$y1988 == 'Y')
a21$mort1989 <- sum(a21$y1989 == 'D') / sum(a21$y1989 == 'Y')
etc
This seems absurd. Is there a more efficient way to do this?
Upvotes: 2
Views: 306
Reputation: 887621
We can use summarise_at to do the division for each of the yYear columns after grouping by 'age':
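library(dplyr)
df %>%
  group_by(age) %>%
  # for each year column, divide the count of "D" by the count of "Y"
  # (a formula lambda is used here because funs() is deprecated in newer dplyr)
  summarise_at(vars(matches("y\\d{4}")), ~ sum(. == "D") / sum(. == "Y"))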
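In dplyr 1.0.0 and later, summarise_at() is superseded by across(). The following is a minimal sketch of the same calculation in that style, assuming the example df from the question and a dplyr version that provides across(). The helper name mort_ratio and the result name res are hypothetical and used only for illustration.

```r
library(dplyr)

# Example data copied from the question
df <- data.frame(
  y1988 = rep(c('Y', 'Y', 'Y', 'M', 'D', 'Y', 'Y', 'D', 'X', 'D'), 25),
  y1989 = rep(c('Y', 'M', 'D', 'Y', 'X', 'Y', 'X', 'Y', 'Y', 'Y'), 25),
  y1990 = rep(c('D', 'Y', 'D', 'X', 'Y', 'M', 'D', 'Y', 'Y', 'Y'), 25),
  y1991 = rep(c('D', 'Y', 'Y', 'M', 'D', 'Y', 'Y', 'X', 'D', 'Y'), 25),
  age = rep(20:69, 5),
  ID = 1:250
)

# Hypothetical helper: count of "D" divided by count of "Y" in one column
mort_ratio <- function(x) sum(x == "D") / sum(x == "Y")

res <- df %>%
  group_by(age) %>%
  # apply the helper to every column named "y" followed by four digits
  summarise(across(matches("^y\\d{4}$"), mort_ratio), .groups = "drop")

head(res)
```

Note that an age group with no 'Y' in a given year gives Inf (if it has at least one 'D') or NaN (if it has none), because the denominator is zero. Depending on the analysis, those cases may need to be handled separately.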
Upvotes: 3