Reputation: 25
I'm new here and couldn't find an answer to my question. I have a character vector and want to detect the strings that contain MS, MA, or Master, but exclude those that contain MBA:
input <- c("Master of Business Administration (MBA) program", "MS, MA, Master", "Master")
Desired output with str_detect: FALSE, TRUE, TRUE
Edit: this worked for me now:
str_detect(input, "\\bMS\\b|\\bMaster\\b|\\bMA\\b") & !str_detect(input,"\\bMBA\\b")
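To see what the \\b word boundaries contribute, here is a minimal sketch that compares the pattern with and without them. The `extra` strings are made-up examples, not part of the original data.

```r
library(stringr)

# Hypothetical extra observations, not from the original question
extra <- c("MAJOR in MSc studies", "MS and MBA", "MA")

# Whole-word matching, as in the edit above
with_bounds <- str_detect(extra, "\\bMS\\b|\\bMaster\\b|\\bMA\\b") &
  !str_detect(extra, "\\bMBA\\b")

# Plain substring matching for comparison
no_bounds <- str_detect(extra, "MS|Master|MA") & !str_detect(extra, "MBA")

print(with_bounds)
# [1] FALSE FALSE  TRUE
print(no_bounds)
# [1]  TRUE FALSE  TRUE
```

Without boundaries, "MAJOR" is picked up because it contains the substring "MA"; with boundaries only the standalone word matches.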
Upvotes: 1
Views: 1627
Reputation: 626845
You may use a single PCRE pattern (you need to use grepl with perl=TRUE):
> grepl('^(?!.*\\bMBA\\b).*\\b(?:Master|MA)\\b', input, perl=TRUE)
[1] FALSE TRUE TRUE
See the regex demo. Note that you may use the same pattern with str_detect:
> str_detect(input, '^(?!.*\\bMBA\\b).*\\b(?:Master|MA)\\b')
[1] FALSE TRUE TRUE
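The question also asks for MS, which the alternation above leaves out; the sample input happens to give the same result either way. A minimal sketch of an extended pattern follows. The `extra` string and the `extended` pattern are illustrative assumptions, not part of the original answer.

```r
input <- c("Master of Business Administration (MBA) program", "MS, MA, Master", "Master")
# Hypothetical extra observation that contains only "MS"
extra <- "MS in Statistics"

original <- '^(?!.*\\bMBA\\b).*\\b(?:Master|MA)\\b'
# Assumed extension: add MS to the alternation
extended <- '^(?!.*\\bMBA\\b).*\\b(?:Master|MA|MS)\\b'

res_input    <- grepl(extended, input, perl = TRUE)  # unchanged on the sample input
res_original <- grepl(original, extra, perl = TRUE)  # MS is not in the alternation
res_extended <- grepl(extended, extra, perl = TRUE)  # MS is matched as a whole word

print(res_input)
# [1] FALSE  TRUE  TRUE
print(c(res_original, res_extended))
# [1] FALSE  TRUE
```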
Details
^ - start of string
(?!.*\\bMBA\\b) - a negative lookahead that fails the match if there is a whole word MBA after any 0+ chars other than line break chars from the start of the string (add (?s) at the pattern start to enable multiple line input)
.* - any 0+ chars other than line break chars, as many as possible
\\b(?:Master|MA)\\b - a whole word Master or MA.
Upvotes: 3
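The (?s) remark in the details can be checked with a small sketch. The two-line string `multi` is a made-up example; it puts "Master" on the first line and "MBA" on the second, so the lookahead only sees MBA when the dot is allowed to cross line breaks.

```r
# Hypothetical two-line observation: "Master" on line 1, "MBA" on line 2
multi <- "Master of Science\nformerly MBA"

single_line <- '^(?!.*\\bMBA\\b).*\\b(?:Master|MA)\\b'
# Same pattern with the dot-matches-newline flag prepended
dot_all <- paste0('(?s)', single_line)

res_default <- grepl(single_line, multi, perl = TRUE)  # lookahead stops at the line break
res_dotall  <- grepl(dot_all, multi, perl = TRUE)      # lookahead reaches MBA on line 2

print(c(res_default, res_dotall))
# [1]  TRUE FALSE
```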
Reputation: 2864
You can combine your logical conditions:
library(stringr)
input <- c("Master of Business Administration (MBA) program", "MS, MA, Master", "Master")
(str_detect(input, "Master") & !str_detect(input, "MBA"))
# [1] FALSE TRUE TRUE
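If the lists of words to include and exclude may grow, the same combination can be wrapped in a small helper. This is only a sketch: `detect_words` is a hypothetical function name, and it assumes the words contain no regex metacharacters.

```r
library(stringr)

# Hypothetical helper: TRUE where any `include` word occurs as a whole word
# and no `exclude` word does. Assumes the words need no regex escaping.
detect_words <- function(x, include, exclude) {
  inc <- str_c("\\b(?:", str_c(include, collapse = "|"), ")\\b")
  exc <- str_c("\\b(?:", str_c(exclude, collapse = "|"), ")\\b")
  str_detect(x, inc) & !str_detect(x, exc)
}

input <- c("Master of Business Administration (MBA) program", "MS, MA, Master", "Master")
result <- detect_words(input, include = c("MS", "MA", "Master"), exclude = "MBA")
print(result)
# [1] FALSE  TRUE  TRUE
```

Unlike the plain "Master" and "MBA" substrings above, this version also covers MS and MA and only matches whole words.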
Upvotes: 1