LDT

Reputation: 3088

case_when with multiple conditions in dplyr R

I have a data.frame that looks like this

df <-data.frame(Day=c(0,0,0,1,1,1),type=c("tr1","tr2","ctrl","tr1","tr2","ctrl"),
                mean=c(0.211,0203,0.199,0.119,0.001,0.254), 
                sd=c(0.07,0.141,0.096, 0.0848, 0.0006, 0.0474))

  Day type    mean     sd
1   0  tr1   0.211 0.0700
2   0  tr2 203.000 0.1410
3   0 ctrl   0.199 0.0960
4   1  tr1   0.119 0.0848
5   1  tr2   0.001 0.0006
6   1 ctrl   0.254 0.0474

First I want to group my data frame by Day, i.e. group_by(Day). Within each group, when mean + sd of a treatment type (tr1, tr2) is bigger than mean - sd of the control (ctrl), I want to assign the value "yes" in a new column (new.col); otherwise I want to assign "no".

For example, I want my data to look something like this (it does not have to match exactly):

  Day type    mean     sd new.col
1   0  tr1   0.211 0.0700  yes
2   0  tr2 203.000 0.1410  yes
3   0 ctrl   0.199 0.0960  NA
4   1  tr1   0.119 0.0848  no
5   1  tr2   0.001 0.0006  no
6   1 ctrl   0.254 0.0474  NA

Upvotes: 3

Views: 7765

Answers (2)

DPH

Reputation: 4344

One alternative with dplyr is to join each Day's control row onto the data and compare row by row:

library(dplyr)

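df %>% 
  # attach each Day's control row to every row of that Day (control columns get the suffix _c)
  dplyr::left_join(df %>% dplyr::filter(type == "ctrl"), by = "Day", suffix = c("_t", "_c")) %>%
  # one group per Day/type combination, which is a single row per group for this data
  dplyr::group_by(Day, type_t) %>%
  dplyr::mutate(new.col = dplyr::case_when(type_t == "ctrl" ~ NA_character_,
                                           sum(mean_t + sd_t) > mean(mean_c - sd_c) ~ "yes",
                                           TRUE ~ "no")) %>%
  dplyr::ungroup() %>%
  # restore the original column names
  dplyr::select(Day, type = type_t, mean = mean_t, sd = sd_t, new.col)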

# A tibble: 6 x 5
    Day type     mean     sd new.col
  <dbl> <chr>   <dbl>  <dbl> <chr>  
1     0 tr1     0.211 0.07   yes    
2     0 tr2   203     0.141  yes    
3     0 ctrl    0.199 0.096  NA     
4     1 tr1     0.119 0.0848 no     
5     1 tr2     0.001 0.0006 no     
6     1 ctrl    0.254 0.0474 NA  
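
As a side note, with exactly one row per Day/type combination the grouping and the sum()/mean() calls above do not change anything, so the same join idea can be written without them. The following is only a sketch under that assumption; the names ctrl, ctrl_lower and res_join are made up for illustration, and the data is copied from the question as posted (R parses 0203 as 203).

```r
library(dplyr)

# Data as posted in the question (0203 is parsed by R as 203)
df <- data.frame(
  Day  = c(0, 0, 0, 1, 1, 1),
  type = c("tr1", "tr2", "ctrl", "tr1", "tr2", "ctrl"),
  mean = c(0.211, 0203, 0.199, 0.119, 0.001, 0.254),
  sd   = c(0.07, 0.141, 0.096, 0.0848, 0.0006, 0.0474)
)

# One row per Day holding the control's mean - sd
# (assumes exactly one "ctrl" row per Day)
ctrl <- df %>%
  filter(type == "ctrl") %>%
  transmute(Day, ctrl_lower = mean - sd)

res_join <- df %>%
  left_join(ctrl, by = "Day") %>%
  mutate(new.col = case_when(
    type == "ctrl"         ~ NA_character_,  # control rows get NA
    mean + sd > ctrl_lower ~ "yes",
    TRUE                   ~ "no"
  )) %>%
  select(-ctrl_lower)

res_join
```

If a Day had several control rows, the join would duplicate the treatment rows, so this shortcut would need an aggregation step first.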

Upvotes: 2

akrun

Reputation: 887048

After grouping by 'Day', one option is to subset the 'mean' and 'sd' values where 'type' is not (!=) "ctrl", add (+) the two columns, take the sum, and check whether it is greater (>) than the corresponding difference (mean - sd) where 'type' is "ctrl". The logical result is converted to a numeric index by adding 1, and that index is used to pick from a vector of values (c("NO", "Yes")). Finally, case_when sets the rows where 'type' is "ctrl" to NA.

library(dplyr)
df %>% 
    group_by(Day) %>% 
    mutate(new.col = case_when(type == "ctrl" ~ NA_character_, 
     # FALSE + 1 picks "NO", TRUE + 1 picks "Yes"
     TRUE ~ c("NO", "Yes")[1 + (sum(mean[type != "ctrl"] + 
      sd[type != "ctrl"]) > (mean[type == "ctrl"] - sd[type == "ctrl"]))])) %>%
    ungroup()

-output

# A tibble: 6 x 5
    Day type     mean     sd new.col
  <dbl> <chr>   <dbl>  <dbl> <chr>  
1     0 tr1     0.211 0.07   Yes    
2     0 tr2   203     0.141  Yes    
3     0 ctrl    0.199 0.096  <NA>   
4     1 tr1     0.119 0.0848 NO     
5     1 tr2     0.001 0.0006 NO     
6     1 ctrl    0.254 0.0474 <NA>   
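
Note that sum() here adds up mean + sd over all non-control rows of a Day, so tr1 and tr2 are compared with the control together and always receive the same label. If each treatment row should be compared with the control on its own, as the question seems to ask, a row-wise variant could look like the sketch below. It assumes exactly one "ctrl" row per Day; ctrl_lower and res_rowwise are names made up for illustration.

```r
library(dplyr)

# Data as posted in the question (0203 is parsed by R as 203)
df <- data.frame(
  Day  = c(0, 0, 0, 1, 1, 1),
  type = c("tr1", "tr2", "ctrl", "tr1", "tr2", "ctrl"),
  mean = c(0.211, 0203, 0.199, 0.119, 0.001, 0.254),
  sd   = c(0.07, 0.141, 0.096, 0.0848, 0.0006, 0.0474)
)

res_rowwise <- df %>%
  group_by(Day) %>%
  mutate(
    # mean - sd of the control in this Day; a single value recycled
    # across the group (assumes exactly one "ctrl" row per Day)
    ctrl_lower = (mean - sd)[type == "ctrl"],
    new.col = case_when(
      type == "ctrl"         ~ NA_character_,  # control rows get NA
      mean + sd > ctrl_lower ~ "yes",          # per-row comparison, no sum()
      TRUE                   ~ "no"
    )
  ) %>%
  ungroup() %>%
  select(-ctrl_lower)

res_rowwise
```

For the posted data both versions label the rows the same way (yes, yes, NA, no, no, NA, apart from capitalisation), but they can differ when only one of the treatments exceeds the control's mean - sd.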

Upvotes: 4
