j262byuu
j262byuu

Reputation: 11

Create new variables in R according to two set of varaibles

I have 1 dataframe like this

id  date_0_0    date_1_0    date_2_0    date_3_0    type_0_0    type_1_0    type_2_0    type_3_0
1   5/17/1966   .           .           .           C10         .           .           .
2   3/7/2012    .           .           .           C34         .           .           .
3   .           .           .           .           .           .           .           .
4   5/11/1998   .           .           2/1/1984    C34         .           .           C34

Those date_?_0 are corresponding to type_?_0 and I have around 200 set of those variables. What I need to do is I need to create a new varaible ,lets call it cancer.

If someone did not have any C34 among their type_?_0 variables then their cancer =N/A.

If someone have C34 among their type_?_0 variables then their cancer =corresponding date.

If someone have multiple C34 among their type_?_0 variables then their cancer =their minimumcorresponding date.

So my ideal output is

id  cancer
1   .
2   3/7/2012
3   .
4   2/1/1984

I have to do it in R but I usually use SAS....In SAS I just create three arrays, first one is date, second one is type and third one is cancer, then I loop through date and type array to assign new values for cancer array. Finally I create a new varaible equal to min(cancer array).

Please help. Thanks !!!

Upvotes: 0

Views: 55

Answers (1)

arg0naut91
arg0naut91

Reputation: 14774

Try:

library(tidyverse)

df %>%
  pivot_longer(-id) %>%
  group_by(id, idx = gsub('.*(\\d+)_(\\d+)', '\\1\\2', name))%>%
  mutate(value = case_when(any(value == 'C34') & grepl('\\d+\\/', value) ~ value))%>%
  group_by(id) %>%
  arrange(as.Date(as.character(value), '%d/%m/%Y')) %>%
  select(id, cancer = value) %>%
  slice(1L)

Output:

# A tibble: 4 x 2
# Groups:   id [4]
     id cancer  
  <int> <fct>   
1     1 NA      
2     2 3/7/2012
3     3 NA      
4     4 2/1/1984

Upvotes: 2

Related Questions