I have string data that I match with a regex, but I would like to exclude strings that end in a particular substring.
dat <- c('long_regex_other_stuff','long_regex_other_random.something')
(dat[grep('long_regex',dat)])
(dat[grep('long_regex.*(?!.*something$)',dat)])
The first grep gives the expected output:
"long_regex_other_stuff" "long_regex_other_random.something"
How can I get the second grep to work? The desired output is:
"long_regex_other_stuff"
Ref: Regular expression to match a line that doesn't contain a word?
Upvotes: 2
Views: 131
Reputation: 174816
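You need to remove the .* that precedes the negative lookahead and add it after the lookahead instead. In your pattern the greedy .* first consumes the rest of the string, so the lookahead is tested at the end of the string, where nothing follows and the assertion always succeeds. Lookaheads also require perl=TRUE, because R's default regex engine does not support them: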
> dat <- c('long_regex','long_regex.something')
> (dat[grep('long_regex(?!.*something).*',dat, perl=T)])
[1] "long_regex"
> (dat[grep('long_regex(?!.*\\bsomething\\b).*',dat, perl=T)])
[1] "long_regex"
In long_regex(?!.*something), the negative lookahead asserts that the string something does not occur anywhere after the substring long_regex. The \\b word boundaries in the second pattern additionally require something to appear as a whole word.
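To see why the position of the lookahead matters, here is a small sketch that runs both placements on the data from the question. The variable names after_greedy and after_prefix are only illustrative, and grepl is used instead of grep so that the result for each element is visible.

```r
dat <- c('long_regex_other_stuff', 'long_regex_other_random.something')

# Lookahead placed after a greedy .*: the .* runs to the end of the string,
# where nothing follows, so the negative assertion holds for every element.
after_greedy <- grepl('long_regex.*(?!.*something$)', dat, perl = TRUE)

# Lookahead placed directly after long_regex: it inspects the rest of the
# string and rejects elements that end in "something".
after_prefix <- grepl('long_regex(?!.*something$)', dat, perl = TRUE)

print(after_greedy)  # TRUE TRUE
print(after_prefix)  # TRUE FALSE
```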
> dat <- c('long_regex_other_stuff','long_regex_other_random.something')
> (dat[grep('long_regex(?!.*\\bsomething\\b).*',dat, perl=T)])
[1] "long_regex_other_stuff"
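If a lookahead is not strictly required, one possible alternative (a sketch, not part of the approach above) is to combine two plain grepl calls: one that keeps the strings matching long_regex and one that drops those ending in something. This works with R's default regex engine, so perl=TRUE is not needed. The names keep and result are only illustrative.

```r
dat <- c('long_regex_other_stuff', 'long_regex_other_random.something')

# Keep elements that contain long_regex but do not end in "something".
keep <- grepl('long_regex', dat) & !grepl('something$', dat)
result <- dat[keep]

print(result)  # "long_regex_other_stuff"
```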
Upvotes: 1