Triparna Poddar
Triparna Poddar

Reputation: 428

convert one variable into two variables by multiple conditions in R

My dataframe looks like this:

df:
  patID   D1   D2   D3  Code Day Month Year
1    11 V581   G8 V581  5468   9     7 2017
2    12 V581 V588 V588 0905F   8     8 2017
3    13 V582 B678 B678 J8907   9     5 2016
4    14  G67 A568 A568 J8907   7     9 2016
5    15 V583 V584 V584 J9056   8     5 2016
6    16 V584 V586 V586 G0890   5     4 2017
7    17  G69 G789 A567 A0891   4    10 2017

I have to take cases if from D1, D2, D3 combined matches starting with "V58". Also I need to take values from Code but split them into variables depending on certain conditions ie CodeA if ("J8907","J9056") and CodeB if ("0905F", "G0890").

The output should look like this:

  patID Diags CodeA CodeB     Date
1    11  V581  <NA>  <NA> 7/9/2017
2    12  V581 0905F  <NA> 8/8/2017
3    13  V582  <NA> J8907 5/9/2016
4    14  <NA>  <NA> J8907 9/7/2016
5    15  V583  <NA> J9056 5/8/2016
6    16  V584 G0890  <NA> 4/5/2017
7    12  V588 0905F  <NA> 8/8/2017
8    13  <NA>  <NA> J8907 5/9/2016
9    14  <NA>  <NA> J8907 9/7/2016
10   15  V584  <NA> J9056 5/8/2016
11   16  V586 G0890  <NA> 4/5/2017
12   11  V581  <NA>  <NA> 7/9/2017
13   12  V588 0905F  <NA> 8/8/2017
14   13  <NA> J8907 J8907 5/9/2016
15   14  <NA>  <NA> J8907 9/7/2016
16   15  V584  <NA> J9056 5/8/2016
17   16  V586 G0890  <NA> 4/5/2017

I tried but this is not working(and I took this long format approach because in original data I have similar sets of variables like D1:D3 and needs to handled similarly)

datnew <- bind_rows(
  dat %>% select(patID,Code:Year,ends_with("1")) %>% rename_all(str_remove, "1"),
  dat %>% select(patID,Code:Year,ends_with("2")) %>% rename_all(str_remove, "2"),
  dat %>% select(patID,Code:Year,ends_with("3")) %>% rename_all(str_remove, "3")
) %>%
  transmute(
    patID,
    diags = D,
    CodeA =Code,
    CodeB = Code,
    Dates = as.Date(paste0(Day,"/",Month, "/", Year),format="%d/%m/%Y")
  )%>%
  filter(
     grepl("V58",diags)|
     CodeA  %in%    c("0905F","G0890")|
     CodeB %in% c("J8907","J9056")
  )

Thanks in Advance!!

Upvotes: 1

Views: 78

Answers (1)

Ronak Shah
Ronak Shah

Reputation: 389175

You can try :

library(dplyr)
library(tidyr)

df %>%
  unite(Date, Day, Month, Year, sep = '/') %>%
  pivot_longer(cols = D1:D3, values_to = 'Diags') %>%
  mutate(Diags = replace(Diags, -grep('V58', Diags), NA), 
         CodeA = ifelse(Code %in% c("J8907","J9056"), Code, NA),
         CodeB = ifelse(Code %in% c("0905F", "G0890"), Code, NA)) %>%
  filter(!(is.na(Diags) & is.na(CodeA) & is.na(CodeB))) %>%
  select(-Code, -name)

# A tibble: 17 x 5
#   patID Date     Diags CodeA CodeB
#   <int> <chr>    <chr> <chr> <chr>
# 1    11 9/7/2017 V581  NA    NA   
# 2    11 9/7/2017 V581  NA    NA   
# 3    12 8/8/2017 V581  NA    0905F
# 4    12 8/8/2017 V588  NA    0905F
# 5    12 8/8/2017 V588  NA    0905F
# 6    13 9/5/2016 V582  J8907 NA   
# 7    13 9/5/2016 NA    J8907 NA   
# 8    13 9/5/2016 NA    J8907 NA   
# 9    14 7/9/2016 NA    J8907 NA   
#10    14 7/9/2016 NA    J8907 NA   
#11    14 7/9/2016 NA    J8907 NA   
#12    15 8/5/2016 V583  J9056 NA   
#13    15 8/5/2016 V584  J9056 NA   
#14    15 8/5/2016 V584  J9056 NA   
#15    16 5/4/2017 V584  NA    G0890
#16    16 5/4/2017 V586  NA    G0890
#17    16 5/4/2017 V586  NA    G0890

Upvotes: 1

Related Questions