Joran de Bock

Reputation: 23

How can you run different code on different subsets in R with dplyr?

I have a dataset that I need to modify, and I want to run different code on different subsets within a dplyr pipeline. The data contain glycaemia values for patients in the ICU. My current code looks like this:

dfavg <- df %>%
  group_by(patientid) %>%
  mutate(icuoutcome = ifelse(row_number() != n(), 0, icuoutcome)) %>%
  mutate(strata = days) %>%
  mutate(survtime = max(days)-days) %>%
  group_by(days, .add = TRUE) %>%  # `add` was renamed to `.add` in dplyr 1.0.0
  mutate(gly_mean = mean(glycaemia))

However, I need this code to run only for the first 5 days a patient is in the ICU, and different code for days 6 to 15. I tried filter(days <= 5), but then I lose all the other rows. How can I make the code above run for days 1-5 and other code run for days 6-15, all within the same dataset or the same pipeline? I also thought of "unfiltering" afterwards, which I don't think is possible, and of using group_by with a condition (like days <= 5).

Thank you in advance

Upvotes: 1

Views: 97

Answers (1)

akrun

Reputation: 887501

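We can split the data on a condition on 'days' and then apply a list of functions, one per subset, to the resulting list. Here funslist stands for a list of two functions, given in the same order as the pieces returned by split (first the rows where the condition is FALSE, then those where it is TRUE):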

library(dplyr)
library(purrr)
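# split() names the pieces "FALSE" (days <= 5) and "TRUE" (days > 5);
# map2() pairs each piece (.x) with the function at the same position (.y)
split(df, df$days > 5) %>%
    map2(funslist, ~ .y(.x))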
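To make this concrete, here is a minimal sketch on toy data. The data frame, the two functions early_fun and late_fun, and the summaries they compute are hypothetical stand-ins, not the actual code from the question; the point is only to show how funslist lines up with the split and how bind_rows puts the pieces back into one dataset.

```r
library(dplyr)
library(purrr)

# Toy data: a hypothetical stand-in for the ICU dataset in the question
df <- tibble(
  patientid = rep(1:2, each = 8),
  days      = rep(1:8, times = 2),
  glycaemia = c(100, 110, 120, 130, 140, 150, 160, 170,
                 90,  95, 100, 105, 110, 115, 120, 125)
)

# Hypothetical code for days 1-5: per-patient mean glycaemia
early_fun <- function(d) {
  d %>%
    group_by(patientid) %>%
    mutate(gly_mean = mean(glycaemia)) %>%
    ungroup()
}

# Hypothetical code for later days: per-patient maximum glycaemia
late_fun <- function(d) {
  d %>%
    group_by(patientid) %>%
    mutate(gly_max = max(glycaemia)) %>%
    ungroup()
}

# Order matters: split() returns "FALSE" (days <= 5) first, then "TRUE"
funslist <- list(early_fun, late_fun)

result <- split(df, df$days > 5) %>%
  map2(funslist, ~ .y(.x)) %>%
  bind_rows() %>%              # columns missing in one piece are filled with NA
  arrange(patientid, days)
```

All rows are kept: gly_mean is filled only for days 1-5 and gly_max only for the later days. With more than two periods, one option would be to split on something like cut(df$days, c(0, 5, 15, Inf)) and supply one function per level.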

A small reproducible example, taking the mean of mpg where vs is 0 and the maximum of mpg where vs is 1:

data(mtcars)
split(mtcars$mpg, mtcars$vs) %>%
    map2_dbl(list(mean, max), ~ .y(.x))
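If everything should stay inside a single pipeline, one possible alternative is to label each row with its period and dispatch on that label with dplyr's group_modify. This is only a sketch under assumptions: the toy data, the period labels "d1_5" and "d6_15", the gly_summary column and the two functions in funs are all made up for illustration.

```r
library(dplyr)

# Toy data: a hypothetical stand-in for the ICU dataset in the question
df <- tibble(
  patientid = rep(1:2, each = 8),
  days      = rep(1:8, times = 2),
  glycaemia = c(100, 110, 120, 130, 140, 150, 160, 170,
                 90,  95, 100, 105, 110, 115, 120, 125)
)

# Hypothetical per-period code, looked up by the period label
funs <- list(
  d1_5  = function(d) mutate(d, gly_summary = mean(glycaemia)),
  d6_15 = function(d) mutate(d, gly_summary = max(glycaemia))
)

result <- df %>%
  mutate(period = if_else(days <= 5, "d1_5", "d6_15")) %>%
  group_by(patientid, period) %>%
  # .x is the rows of one group, .y is a one-row tibble holding its keys
  group_modify(~ funs[[.y$period]](.x)) %>%
  ungroup()
```

Here each patient-by-period group is passed to the function registered under its period label, so no rows are dropped and the output stays a single data frame.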

Upvotes: 2
