Reputation: 6755
I have this regex behaviour which I can't explain. My goal is to extract only the text after the @, yet when my string contains \n preceded by some words, gsub also returns the \n:
string <- ".@address something \n"
gsub("^\\.?@([a-z0-9_]{1,15})[^a-z0-9_]+.*$", "\\1", string, perl = TRUE)
# [1] "address\n"
string <- ".@address \n"
gsub("^\\.?@([a-z0-9_]{1,15})[^a-z0-9_]+.*$", "\\1", string, perl = TRUE)
# [1] "address"
Upvotes: 2
Views: 578
Reputation: 887391
To extract address, you could also use:
library(stringr)
# perl() is deprecated in current stringr; its ICU engine supports lookarounds directly
str_extract(string, "(?<=@)[a-z0-9_]+(?= )")
#[1] "address"
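If you would rather not depend on stringr, the same lookbehind idea can be sketched in base R with regexpr and regmatches. This is only an illustration: the variable names m and handle are made up here, and the trailing lookahead is dropped because the greedy character class already stops at the space.

```r
# Sketch: extract the handle after "@" using base R only.
string <- ".@address something \n"

# perl = TRUE is needed because lookbehind is a PCRE feature.
m <- regexpr("(?<=@)[a-z0-9_]+", string, perl = TRUE)

# regmatches() pulls out the substring that regexpr() located.
handle <- regmatches(string, m)
print(handle)
# [1] "address"
```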
Upvotes: 0
Reputation: 81703
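In Perl-compatible regular expressions (perl = TRUE), the dot (.) does not match \n by default, whereas in R's default regex engine it does. In your first string, .* therefore stops before the final \n, and $ is still satisfied because PCRE also lets it match just before a trailing newline, so the \n is left in the result. In your second string the negated class [^a-z0-9_]+ consumes the \n itself, which is why it disappears. Have a look at this example: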
grepl(".", "\n", perl = FALSE)
# [1] TRUE
grepl(".", "\n", perl = TRUE)
# [1] FALSE
Your code will work if you specify perl = FALSE:
gsub("^\\.?@([a-z0-9_]{1,15})[^a-z0-9_]+.*$", "\\1", string, perl = FALSE)
# [1] "address"
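If you need to keep perl = TRUE for other reasons, one possible workaround is the inline (?s) flag, which makes the dot match newlines in PCRE. A minimal sketch, assuming the same pattern as in the question; the names strings, pattern and result are only for illustration:

```r
# Sketch: keep perl = TRUE but let "." match "\n" via the inline (?s) flag.
strings <- c(".@address something \n", ".@address \n")

# Same pattern as in the question, prefixed with (?s) (PCRE "dotall" mode).
pattern <- "(?s)^\\.?@([a-z0-9_]{1,15})[^a-z0-9_]+.*$"

# gsub() is vectorised over its input, so both strings are handled at once.
result <- gsub(pattern, "\\1", strings, perl = TRUE)
print(result)
# [1] "address" "address"
```

With (?s), .* runs through the trailing \n, so the whole string is replaced by the captured group in both cases.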
Upvotes: 3