Reputation: 437
I'd like to transform the following lines
My best email gmail.com
email com
email.com
to become
My best email
email com
*nothing*
Specifically, I'm using Regex for R, so I know there are different rules for escaping certain characters. I'm very new to Regex, but so far I have
\ .*(com)
which turns the first input into
My
But this pattern does not work when there are no spaces, as in the third example, and it removes everything after the first space of a line if the line contains ".com".
Upvotes: 0
Views: 59
Reputation: 627343
Use the following solution:
x <- c("My best email gmail.com","email com", "email.com", "smail.com text here")
trimws(gsub("\\S+\\.com\\b", "", x))
## => [1] "My best email" "email com" "" "text here"
See the R demo.
The \\S+\\.com\\b pattern matches one or more non-whitespace characters, followed by a literal .com, followed by a word boundary.
The trimws function then trims leading and trailing whitespace from every result. This matters for inputs such as "smail.com text here", where a space remains after smail.com is removed.
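To make the two steps easier to follow, here is a minimal sketch that runs them separately on the same input. The variable names removed and cleaned are only illustrative and do not come from the original answer.

```r
x <- c("My best email gmail.com", "email com", "email.com", "smail.com text here")

# Step 1: delete every run of non-whitespace characters that ends in ".com"
# at a word boundary. Whitespace around the deleted token is left in place.
removed <- gsub("\\S+\\.com\\b", "", x)
print(removed)

# Step 2: strip the leftover leading/trailing whitespace.
cleaned <- trimws(removed)
print(cleaned)
```

Printing removed shows the stray spaces ("My best email " and " text here") that trimws cleans up in the second step. The word boundary also means a token such as "a.community" should be left alone, since .com is followed by another word character there.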
Note that the TRE regex engine (the default for gsub when neither perl = TRUE nor fixed = TRUE is set) does not support shorthand character classes such as \\s or \\d inside bracket expressions.
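If you do need a bracket expression, the following sketch shows two spellings that I would expect to behave like the \\S+ version: a POSIX class with the default engine, or \\s inside brackets with perl = TRUE (PCRE). The variable names are illustrative, and the sketch assumes the same sample vector as above.

```r
x <- c("My best email gmail.com", "email com", "email.com", "smail.com text here")

# Shorthand class outside brackets: works with the default TRE engine.
shorthand <- trimws(gsub("\\S+\\.com\\b", "", x))

# Negated bracket expression with a POSIX class: also TRE-compatible.
posix <- trimws(gsub("[^[:space:]]+\\.com\\b", "", x))

# Shorthand class inside brackets: switch to PCRE with perl = TRUE.
pcre <- trimws(gsub("[^\\s]+\\.com\\b", "", x, perl = TRUE))

print(identical(shorthand, posix))
print(identical(shorthand, pcre))
```

Both comparisons should print TRUE on this input. For a pattern as simple as this one, the plain \\S+ form is the shortest option; the bracket forms become useful once other characters need to be excluded alongside whitespace.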
Upvotes: 5