gsub returns \n (newline)

Question

I have some regex behaviour which I can't explain. My goal is to extract only the text after the @, yet when my string contains a \n preceded by some words, gsub returns the \n as well:

string <- ".@address something \n"
gsub("^\\.?@([a-z0-9_]{1,15})[^a-z0-9_]+.*$", "\\1", string, perl=T);
# [1] "address\n"
string <- ".@address \n"
gsub("^\\.?@([a-z0-9_]{1,15})[^a-z0-9_]+.*$", "\\1", string, perl=T);
# [1] "address"

Sven Hohenstein · Accepted Answer

In Perl-compatible regular expressions, . does not match \n. This is in contrast to R's default ("normal") regular expressions, where . matches \n too. Have a look at this example:

grepl(".", "\n", perl = FALSE)
# [1] TRUE
grepl(".", "\n", perl = TRUE)
# [1] FALSE
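
To see how this affects the pattern from the question, the following sketch extracts the text that the whole pattern matches under each engine. The variable names (pattern, pcre_match, tre_match) are only illustrative. With perl = TRUE the match stops before the final \n, so gsub leaves that character in the result; with perl = FALSE the match covers the entire string.

```r
string  <- ".@address something \n"
pattern <- "^\\.?@([a-z0-9_]{1,15})[^a-z0-9_]+.*$"

# Text matched by the whole pattern with the PCRE engine:
# .* stops at the \n, and $ matches just before a trailing newline
pcre_match <- regmatches(string, regexpr(pattern, string, perl = TRUE))

# Text matched by the whole pattern with R's default engine:
# .* also consumes the \n, so the match runs to the end of the string
tre_match <- regmatches(string, regexpr(pattern, string, perl = FALSE))

nchar(pcre_match)
# [1] 20
nchar(tre_match)
# [1] 21
```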

Your code will work if you specify perl = FALSE:

gsub("^\\.?@([a-z0-9_]{1,15})[^a-z0-9_]+.*$", "\\1", string, perl = FALSE)
# [1] "address"
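
If perl = TRUE is needed for other reasons, one possible alternative (not part of the original answer, so treat it as a sketch) is the PCRE inline modifier (?s), which makes . match \n as well. The variable names below are illustrative.

```r
string  <- ".@address something \n"
pattern <- "^\\.?@([a-z0-9_]{1,15})[^a-z0-9_]+.*$"

# (?s) switches PCRE into "dotall" mode, so .* also consumes the trailing \n
result <- gsub(paste0("(?s)", pattern), "\\1", string, perl = TRUE)
result
# [1] "address"
```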
