Adam_S

Reputation: 770

ifelse with substring R

This feels like it should be an easy question, but I have looked here and other places and can't find a simple answer.

I have survey responses and I need to create a 1/0 dummy for the source of each response. I am trying to create a simple flag variable by looking through all the data in the comment field and, if the substring matches, flagging it as 1.

Example data:

ID     comment
1      rubber chickens
2      180107 RG - email taken from 2017 graduate survey

I need R to look through the comment field and, any time it sees the phrase 'graduate survey', code my grad_svy field as 1, otherwise 0.

When I write

data$grad_svy <- ifelse((substr(data$comment,34,49) == "graduate survey"),1,0) 

It runs, but it doesn't mark anything as 1, when in fact there are hundreds of rows it should be marking. I know the two-word phrase begins at 34 and ends at 49 for every instance in the field. I am not sure what I'm doing wrong; the documentation for ifelse and substr has been pretty unhelpful.

Upvotes: 1

Views: 2854

Answers (2)

C-x C-c

Reputation: 1311

You may want to use grepl and data.table for things like this. For example:

library(data.table)
setDT(data)
data[, grad_svy := as.numeric(grepl("graduate survey", comment))]
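To try this on the two sample rows from the question, something like the following should work. It is only a sketch: the data frame is rebuilt by hand from the question's example, and fixed = TRUE is my addition (it makes grepl treat the pattern as a literal string rather than a regular expression), not something the question requires.

```r
library(data.table)

# Sample rows copied from the question's example data
data <- data.frame(
  ID = 1:2,
  comment = c("rubber chickens",
              "180107 RG - email taken from 2017 graduate survey"),
  stringsAsFactors = FALSE
)

setDT(data)  # converts the data.frame to a data.table by reference

# fixed = TRUE is an assumption added here: match the phrase literally
data[, grad_svy := as.numeric(grepl("graduate survey", comment, fixed = TRUE))]

data$grad_svy   # 0 1
```

Because := modifies the table in place, there is no need to assign the result back to data.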

Upvotes: 1

Dave Gruenewald

Reputation: 5689

You can try this, which uses only base R:

data$grad_svy <- as.numeric(grepl("graduate survey", data$comment))

grepl returns a logical vector with one element per row: TRUE where the pattern "graduate survey" is found in data$comment, FALSE otherwise. as.numeric then converts that logical vector into numbers for you: TRUE becomes 1 and FALSE becomes 0.
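As for why the original substr comparison never matched: assuming the comment reads exactly as in the question's second sample row, characters 34 to 49 are 16 characters long and start on the space before "graduate", so the slice can never equal the 15-character string "graduate survey". A small sketch on the sample rows (the data frame is rebuilt by hand, and fixed = TRUE is an optional addition):

```r
# Sample rows copied from the question's example data
data <- data.frame(
  ID = 1:2,
  comment = c("rubber chickens",
              "180107 RG - email taken from 2017 graduate survey"),
  stringsAsFactors = FALSE
)

# The original slice is 16 characters wide and begins with a space,
# so it cannot equal the 15-character target string.
slice <- substr(data$comment[2], 34, 49)
nchar(slice)                 # 16
slice == "graduate survey"   # FALSE

# Starting one character later matches, but only while the phrase sits at
# exactly this position in every comment.
substr(data$comment[2], 35, 49) == "graduate survey"   # TRUE

# grepl() searches the whole string, so the position does not matter.
data$grad_svy <- as.numeric(grepl("graduate survey", data$comment, fixed = TRUE))
data$grad_svy   # 0 1
```

Shifting the start to 35 would fix the substr version for this row, but grepl is the safer choice because it does not depend on every comment having the same layout.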

Upvotes: 2
