HSC

Reputation: 129

In R create binary columns conditional on character columns

library(stringr)
disorder <- c("depression","adhd","anxiety","bipolar",NA)
keywords <- c("depression | depressive", "adhd","anxiety","bi","n/a")
df1 <- as.data.frame(cbind(disorder,keywords))

survey <- c("depression adhd",
        "bipolar disorder",
        "bi  adhd",
        "adhd  anxiety",
        "depressive",
        "adhd bi",
        "n/a")
df2 <- as.data.frame(survey)
df2$depression <- ifelse(str_detect(df2$survey,df1$keywords[1]),"yes","no")
df2$adhd <- ifelse(str_detect(df2$survey,df1$keywords[2]),"yes","no")
df2$anxiety <- ifelse(str_detect(df2$survey, df1$keywords[3]),"yes","no")
df2$bipolar <- ifelse(str_detect(df2$survey, df1$keywords[4]),"yes","no")
df2$na <-  ifelse(str_detect(df2$survey, df1$keywords[5]),"yes","no")
df2

            survey depression adhd anxiety bipolar  na
1  depression adhd        yes  yes      no      no  no
2 bipolar disorder         no   no      no     yes  no
3         bi  adhd         no  yes      no     yes  no
4    adhd  anxiety         no  yes     yes      no  no
5       depressive        yes   no      no      no  no
6          adhd bi         no  yes      no     yes  no
7              n/a         no   no      no      no yes

I am trying to match each survey response against the keywords so that I get the table shown above (in row 5, "depressive" should give "yes" for depression). Can I do that with some kind of loop? I have a long list of disorders, so I would like reproducible code instead of writing each column by hand.

Upvotes: 2

Views: 37

Answers (2)

Ronak Shah

Reputation: 388962

First remove the whitespace from the keywords column in df1, so that the first pattern becomes depression|depressive.

df1 <- transform(df1,  keywords = gsub('\\s', '', keywords))
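This step matters because spaces inside a regular expression are literal characters. A small check in base R, using only values taken from the question (the variable names pattern_raw and pattern_clean are made up for this illustration):

```r
# Spaces in a regex are matched literally, so the raw keyword only
# matches "depression " (trailing space) or " depressive" (leading space).
pattern_raw   <- "depression | depressive"
pattern_clean <- gsub("\\s", "", pattern_raw)  # "depression|depressive"

survey <- c("depression adhd", "depressive")

grepl(pattern_raw, survey)    # TRUE FALSE
grepl(pattern_clean, survey)  # TRUE TRUE
```

That is why "depressive" on its own comes out as "no" with the original keyword.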

Using the tidyverse you can do:

library(tidyverse)

result <- bind_cols(df2, map_dfc(df1$keywords, 
                         ~ifelse(str_detect(df2$survey, .x), "yes", "no"))) %>%
          rename_with(~df1$keywords, -1)
result

#            survey depression|depressive adhd anxiety  bi n/a
#1  depression adhd                   yes  yes      no  no  no
#2 bipolar disorder                    no   no      no yes  no
#3         bi  adhd                    no  yes      no yes  no
#4    adhd  anxiety                    no  yes     yes  no  no
#5       depressive                   yes   no      no  no  no
#6          adhd bi                    no  yes      no yes  no
#7              n/a                    no   no      no  no yes

In base R you could do it with lapply:

df2[df1$keywords] <- lapply(df1$keywords, function(x) 
                       ifelse(grepl(x, df2$survey), 'yes', 'no'))
df2
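If the new columns should be named after the disorders rather than after the regex patterns, as in the output the question asks for, one possible variation is sketched below. It assumes the keywords have already been stripped of whitespace and that the missing disorder should become a column called "na". The helper name col_names is introduced only for this example.

```r
# Self-contained sketch: data copied from the question, keywords already cleaned.
disorder <- c("depression", "adhd", "anxiety", "bipolar", NA)
keywords <- c("depression|depressive", "adhd", "anxiety", "bi", "n/a")
survey <- c("depression adhd", "bipolar disorder", "bi  adhd",
            "adhd  anxiety", "depressive", "adhd bi", "n/a")
df2 <- data.frame(survey)

# Use the disorder as the column name; fall back to "na" where it is missing (assumption).
col_names <- ifelse(is.na(disorder), "na", disorder)

# One yes/no column per keyword pattern.
df2[col_names] <- lapply(keywords, function(p)
                         ifelse(grepl(p, df2$survey), "yes", "no"))
df2
```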

Upvotes: 2

akrun

Reputation: 887068

We can do this without ifelse, by using the logical result of grepl as an index into c("no", "yes"):

df2[df1$keywords] <- lapply(df1$keywords, function(x) c("no", "yes")[grepl(x, df2$survey) + 1])
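The indexing works because adding 1 to a logical vector turns FALSE into 1 and TRUE into 2, which then selects "no" or "yes" by position. A tiny illustration (the vector hits is made up for this example):

```r
# FALSE + 1 is 1 and TRUE + 1 is 2, so the sum picks an element by position.
hits <- c(TRUE, FALSE, TRUE)

hits + 1                            # 2 1 2
labels <- c("no", "yes")[hits + 1]
labels                              # "yes" "no" "yes"
```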

Upvotes: 0
