Awinl
Awinl

Reputation: 23

Check whether string is present in multiple columns and return true/false column

Dataset similar to:

Day   Age    Med1, Med2, Med3, Med4 ... Med 18
 M     42     En    Fu    Gt    Di       Mo
 F     36     Gt    Pa    Mo    En       Mo
 F     89     Gt    Pa    Pa    En       Na      

I want to add a column returning True/False if, for example, 'Mo' is present in the Med1 - Med18 columns.

Day   Age    Med1, Med2, Med3, Med4 ... Med 18  MoGiven
 M     42     En    Fu    Gt    Di       Mo        T
 F     36     Gt    Pa    Mo    En       Mo        T
 F     89     Gt    Pa    Pa    En       Na        F

Have looked at various other threads and closest solution seems to be:

raw <- imported_data %>%
  mutate_at(var(contains("Med")), funs(.=='Mo')

but am getting error no applicable method for 'tbl_vars' applied to an object of class "quosures"

Would like to avoid doing a nested ifelse for each column if possible, unless this is only way.

Am I on the right lines?

Upvotes: 2

Views: 2409

Answers (1)

MKR
MKR

Reputation: 20095

You can compare all columns of data.frame with text to get a matrix of TRUE/FALSE. Now, one can use rowSums to get the count of matching columns in each row. If count is equal or more than 1 than MoGiven will be TRUE otherwise FALSE:

The base-R function startsWith can be used to get the subset only column starting with Med.

df$MoGiven <- (rowSums(df[,startsWith(names(df),"Med")]=="Mo") >= 1)

#Modified df 
df
#   Day Age Med1 Med2 Med3 Med4 Med18 MoGiven
# 1   M  42   En   Fu   Gt   Di    Mo    TRUE
# 2   F  36   Gt   Pa   Mo   En    Mo    TRUE
# 3   F  89   Gt   Pa   Pa   En    Na   FALSE

Data:

df <- read.table(text =
"Day   Age    Med1 Med2 Med3 Med4  Med18
M     42     En    Fu    Gt    Di       Mo
F     36     Gt    Pa    Mo    En       Mo
F     89     Gt    Pa    Pa    En       Na",
header = TRUE, stringsAsFactors = FALSE)

Upvotes: 4

Related Questions