Reputation: 3
I am trying to write a regex that removes Gmail subaddressing, i.e. strips the dots and the +whatever part from Gmail addresses.
For example:
[email protected] -> [email protected]
I can do it with an Excel formula, but I want to do it with a regex alone, i.e. one that matches all the characters that should be removed.
=concat(replace(REGEXREPLACE(left(A1,find("@",A1)-1),"\.",""),find("+",REGEXREPLACE(left(A1,find("@",A1)-1),"\.","")),len(REGEXREPLACE(left(A1,find("@",A1)),"\.",""))-find("+",REGEXREPLACE(left(A1,find("@",A1)-1),"\.","")),""),right(A1,len(A1)-find("@",A1)+1))
The domain part should stay intact.
Upvotes: 0
Views: 575
Reputation: 2587
Here's my try on this: (\.(?=[^@]*?@)|\+[^@]*?(?=@))
You can see a working demo here.
The expression matches everything you want to remove. It relies on lookaheads that contain quantifiers; I hope the regex engine you are using supports them.
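To make the behaviour concrete, here is a minimal sketch that applies the expression with Python's re module. The function name and the sample address are my own assumptions for illustration; they are not taken from the question.

```python
import re

# The pattern from this answer:
#   \.(?=[^@]*?@)     a dot that still has an "@" somewhere after it
#   \+[^@]*?(?=@)     a "+" and everything up to, but not including, the "@"
SUBADDRESS = re.compile(r"(\.(?=[^@]*?@)|\+[^@]*?(?=@))")


def strip_subaddressing(address: str) -> str:
    """Delete every match, so dots and the +tag vanish from the local part."""
    return SUBADDRESS.sub("", address)


# Hypothetical sample address.
print(strip_subaddressing("first.last+news@gmail.com"))  # firstlast@gmail.com
```

The dot in the domain is left alone because no "@" follows it, so the first lookahead fails there.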
Maybe there's also a solution without lookaheads, but that's what I can offer for now.
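If the engine turns out not to support lookaheads, one possible workaround is to give up on a single expression and run several plain replacements instead. The sketch below is only an assumption about how that could look, again in Python with a made-up function name: it consumes the "@" in each match and puts it back through the replacement string.

```python
import re


def strip_without_lookahead(address: str) -> str:
    """Remove the +tag and the dots before the "@" without any lookahead."""
    # Drop the +tag: match it together with the "@", then restore the "@".
    address = re.sub(r"\+[^@]*@", "@", address, count=1)
    # Each pass removes the leftmost dot that still has an "@" after it,
    # keeping the captured rest of the local part plus the "@".
    while True:
        stripped = re.sub(r"\.([^@]*@)", r"\1", address, count=1)
        if stripped == address:
            return address
        address = stripped


# Hypothetical sample address.
print(strip_without_lookahead("first.last+news@gmail.com"))  # firstlast@gmail.com
```

The trade-off is that this needs a loop (or as many replacement passes as there are dots), so it does not satisfy the "match only the characters to remove" requirement in one expression.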
EDIT: I changed the expression so it's not using the evil .*? anymore.
Upvotes: 1