Reputation: 193
I know this question has been asked before, and I've been trying to adapt those answers to my situation, but I can't tell what I'm doing wrong.
I have a dataframe where I'm trying to create a new True/False column based on whether an element in another column has a string I'm searching for.
cpt <- data.frame(value = c("62267", "62268", "62269"))
ex <- data.frame(code = c("2456", "62267", "6200", "62268", "63001", "62269"))
I want the new column to be TRUE when a string in ex$code equals one of the strings in cpt$value.
I've tried this:
cpt1 <- paste(cpt, collapse = '|')
setDT(ex)[,i4 := str_extract(ex$code, cpt)]
and
setDT(ex)[,i3 := sapply(cpt1, grepl, ex$code)]
and
setDT(ex)[,i2 := any(grep(cpt1,ex$code))]
but my "i" column always comes out as NULL. I'd like to keep using the data.table package, since I have chains following this snippet of code. What am I doing wrong? Any help or advice would be greatly appreciated!
Upvotes: 0
Views: 655
Reputation: 6489
The TRUE/FALSE column could also be generated using the function %chin% in the data.table package. It checks whether each element (string) on its left-hand side appears in its right-hand side.
library(data.table)
setDT(ex)[, i := code %chin% cpt$value]
# code i
# 1: 2456 FALSE
# 2: 62267 TRUE
# 3: 6200 FALSE
# 4: 62268 TRUE
# 5: 63001 FALSE
# 6: 62269 TRUE
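For reference, here is a self-contained sketch of the same idea that can be pasted into a fresh session. It rebuilds the question's two data frames and adds a base-R %in% column for comparison; the column name i_base is only illustrative, and stringsAsFactors = FALSE is set explicitly on the assumption that %chin% is meant for character vectors rather than factors.

```r
library(data.table)

# Rebuild the question's data; keep the columns as character, not factor
cpt <- data.frame(value = c("62267", "62268", "62269"),
                  stringsAsFactors = FALSE)
ex  <- data.frame(code = c("2456", "62267", "6200", "62268", "63001", "62269"),
                  stringsAsFactors = FALSE)

setDT(ex)

# %chin% is data.table's character-only counterpart of %in%
ex[, i := code %chin% cpt$value]

# Base-R equivalent (illustrative column name), for comparison
ex[, i_base := code %in% cpt$value]

print(ex)
```

Both columns should agree here; %in% is the more general choice if the column might not be character.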
Upvotes: 1
Reputation: 887078
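We need to create the pattern from a vector instead of a data.frame, i.e. extract the column 'value' and paste it. Pasting the whole data.frame (as in paste(cpt, collapse = '|')) does not produce the intended "62267|62268|62269" pattern.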
library(data.table)
library(stringr)
cpt1 <- paste(cpt$value, collapse = '|')
setDT(ex)[, i4 := str_extract(code, cpt1)]
ex[, i3 := sapply(cpt1, grepl, code)]
ex[, i2 := any(grepl(cpt1, code))]
-output
ex
code i4 i3 i2
1: 2456 <NA> FALSE TRUE
2: 62267 62267 TRUE TRUE
3: 6200 <NA> FALSE TRUE
4: 62268 62268 TRUE TRUE
5: 63001 <NA> FALSE TRUE
6: 62269 62269 TRUE TRUE
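Two things are worth noting about the output above. i2 is TRUE on every row because any() reduces the whole vector to a single value, which := then recycles. Also, an unanchored pattern matches substrings, whereas the question asks for equality. The following is a minimal sketch of the difference, assuming exact matches are wanted; the code "162267" and the column names partial and exact are hypothetical, added only for illustration.

```r
library(data.table)

cpt <- data.frame(value = c("62267", "62268", "62269"),
                  stringsAsFactors = FALSE)

# "162267" is a hypothetical code that merely contains a cpt value
ex <- data.table(code = c("2456", "62267", "162267", "62269"))

# Unanchored: TRUE whenever a cpt value occurs anywhere in the string
pat_any <- paste(cpt$value, collapse = "|")

# Anchored: TRUE only when the whole string equals one of the cpt values
pat_exact <- paste0("^(", pat_any, ")$")

ex[, partial := grepl(pat_any, code)]
ex[, exact := grepl(pat_exact, code)]

print(ex)
```

With the anchored pattern, "162267" is no longer flagged, which matches what code %in% cpt$value would give without any regular expression.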
Upvotes: 2