Triparna Poddar
Triparna Poddar

Reputation: 438

subset a dataframe on multiple conditions in R

My dataset has several variables and I want to build a subset as well as create new variables based on those conditions

dat1 
S1 S2 H1 H2 Month1 Year1 Month2 Year2
16 17 81 70  09   2017   07      2017
17 16 80 70  08   2017   08      2016
14 16 81 81  09   2016   05      2016
18 15 70 81  07   2016   09      2017
17 16 80 80  08   2016   05      2016
18 18 81 70  05   2017   04      2016

I want to subset such that if S1=16,17,18 and H1=81,80 then I create a new variable Hist=H1 , date=paste(Month1,Year1) Sip = S1 Same goes for set of S2, H2 . My output should be: [ The first 4 rows comes for sets of S1,H1, Month1,Year2 and last 2 rows comes from S2,H2,Month2,Year2

Hist Sip Date
81   16  09-2017
80   17  08-2017
80   17  08-2016
81   18  05-2017
81   16  05-2016
80   16  05-2016

My Code :

datnew <- dat1 %>%
 mutate(Date=ifelse((S1==16|S1==17|S1=18)&(H1==80|H1==81),paste(01,Month1,Year1,sep="-"),
                      ifelse((S2==16|S2==17|S2==18)&(H2==80|H2==81),paste(Month2,Year2,sep="-"),"NA")),
      hist=ifelse((S1==16|S1==17|S1=18)&(H1==80|H1==81),H1,
                  ifelse((S2==16|S2==17|S2==18)&(H2==80|H2==81),H2,"NA")),
      sip=ifelse((S1==16|S1==17|S1=18)&(H1==80|H1==81),S1,
                  ifelse((S2==16|S2==17|S2==18)&(H2==80|H2==81),S2,"NA"))) 

In the original data I have 10 sets of such columns ie S1-S10, H1-H10, Month1_-Month10... And for each variable I have lot more conditions of numbers. In this method it is going on and on. Is there any better way to do this?

Thanks in advance

Upvotes: 1

Views: 211

Answers (1)

Paul
Paul

Reputation: 9107

Here is a tidyverse solution. Separate into two data frames and bind the rows together.

library(tidyverse)
  
bind_rows(
  dat1 %>% select(patientId, ends_with("1")) %>% rename_all(str_remove, "1"),
  dat1 %>% select(patientId, ends_with("2")) %>% rename_all(str_remove, "2")
) %>%
  transmute(
    patientId,
    Hist = H,
    Sip = S,
    date = paste0(Month, "-", Year)
  ) %>%
  filter(
    Sip %in% 16:18,
    Hist %in% 80:81
  )
#> # A tibble: 6 x 4
#>    patientId  Hist   Sip date   
#>        <int> <dbl> <dbl> <chr>  
#> 1          1    81    16 09-2017
#> 2          2    80    17 08-2017
#> 3          5    80    17 08-2016
#> 4          6    81    18 05-2017
#> 5          3    81    16 05-2016
#> 6          5    80    16 05-2016

Upvotes: 2

Related Questions