user1569897

Reputation: 437

How to delete only (anyword).com in Regex?

I'd like to match the following

My best email gmail.com
email com
email.com

to become

My best email
email com
*nothing*

Specifically, I'm using Regex for R, so I know there are different rules for escaping certain characters. I'm very new to Regex, but so far I have

\ .*(com) 

which turns the first input line into

My

But this pattern does not work when there is no space before the word, as in the third example, and it removes everything past the first space of a line if the line has a ".com".

Upvotes: 0

Views: 59

Answers (1)

Wiktor Stribiżew

Reputation: 627343

Use the following solution:

x <- c("My best email gmail.com","email com", "email.com", "smail.com text here")
trimws(gsub("\\S+\\.com\\b", "", x))
## => [1] "My best email" "email com"     ""              "text here"

See the R demo.

The \\S+\\.com\\b pattern matches one or more non-whitespace characters, followed by a literal .com, followed by a word boundary.
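To see what each part of the pattern contributes, here is a minimal sketch that checks which strings match and what text gets matched. The third string (the one containing "company") is an assumed extra input, not taken from the question, added only to illustrate the word boundary.

```r
# Pattern from the answer: non-whitespace run, literal ".com", word boundary
pattern <- "\\S+\\.com\\b"

# The last element is an assumed extra input, used to illustrate \\b
x <- c("My best email gmail.com", "email com", "visit email.company now")

# Which elements contain a match at all?
hits <- grepl(pattern, x)

# Extract the first match from each element that has one
matched <- regmatches(x, regexpr(pattern, x))

print(hits)
print(matched)
```

Only the first string matches: "email com" has no literal dot, so \\. fails there, and in "email.company" the .com is followed by another word character, so \\b fails.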

The trimws function strips leading and trailing whitespace from the resulting strings (e.g. with "smail.com text here", a leading space would otherwise remain after smail.com is removed).
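As a small illustration of why the trimming step is there, the sketch below compares the gsub result before and after trimws. The variable names raw and clean are only illustrative.

```r
x <- c("My best email gmail.com", "smail.com text here")

# Remove the ".com" word only; the neighbouring space is left behind
raw <- gsub("\\S+\\.com\\b", "", x)

# Strip the leftover leading/trailing whitespace
clean <- trimws(raw)

print(raw)
print(clean)
```

Here raw still carries a trailing space in the first element and a leading space in the second, while clean does not.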

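Note that the TRE regex engine (R's default) does not support shorthand character classes such as \\S inside bracket expressions. Inside brackets, use a POSIX class like [^[:space:]] instead, or pass perl=TRUE to switch to PCRE.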
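If the non-whitespace part ever needs to go inside a bracket expression, two variants that should behave the same as the original pattern are sketched below. This is an illustration under the assumption that the inputs are the ones from the answer; the names posix_version and pcre_version are made up for the example.

```r
x <- c("My best email gmail.com", "email com", "email.com", "smail.com text here")

# Default (TRE) engine: use a POSIX character class inside the brackets
posix_version <- trimws(gsub("[^[:space:]]+\\.com\\b", "", x))

# PCRE engine: shorthand classes are allowed inside brackets with perl = TRUE
pcre_version <- trimws(gsub("[\\S]+\\.com\\b", "", x, perl = TRUE))

print(posix_version)
print(pcre_version)
```

Both calls are expected to give the same result as the original solution; the brackets are redundant here and only become useful when more characters are added to the class.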

Upvotes: 5
