Reputation: 1017
I'm trying to extract email addresses from some text, and I'm having a problem with false positives.
My regex for: [email protected]
[^\.^\w+](\w+) *?@ *?(\w+) *?(?:\.|dot) *?(\w+)
Regex for: [email protected]
[^\.^\w+](\w+) *?@ *?(\w+) *?(?:\.|dot) *?(\w+) *?(?:\.|dot) *?(\w+)
I don't want the first regex to match: [email protected]
How can I fix it?
Upvotes: 0
Views: 123
Reputation: 450
The only way to distinguish [email protected] and [email protected] is to maintain a list of valid top-level domains (yes, I'm sorry).
That is, replace your last (\w+)
with (com|org|info|ly|...
and so on.
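As a rough sketch of that substitution in Python: the TLD list below is a tiny illustrative sample rather than a real list, the names TLDS, TWO_LABEL and find_two_label are made up for the example, and the trailing \b is my own addition so that "com" does not match the start of a longer label such as "company".

```python
import re

# Hypothetical, deliberately incomplete sample of top-level domains.
TLDS = ["com", "org", "info", "ly"]
TLD_GROUP = "(" + "|".join(TLDS) + ")"

# The question's first pattern with its last (\w+) swapped for the TLD group.
# As in the question, one non-word character must precede the address.
# The trailing \b (an assumption, not in the question) rejects "company".
TWO_LABEL = re.compile(
    r"[^\.^\w+](\w+) *?@ *?(\w+) *?(?:\.|dot) *?" + TLD_GROUP + r"\b"
)

def find_two_label(text):
    """Return (user, domain, tld) tuples for two-label addresses in text."""
    return TWO_LABEL.findall(text)
```

With this change the pattern no longer accepts an arbitrary second label as the final one, so an address whose domain has three labels is not picked up by the two-label pattern.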
Also, you could handle both cases with a single regex.
Also, my address could be [email protected], so be careful...
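One possible shape for such a single regex, as a sketch only: it allows one or more domain labels before a final label taken from a TLD list. The TLD sample is incomplete, the names SEP, EMAIL and extract are hypothetical, and I have swapped the leading character class from the question for a negative lookbehind so that an address at the very start of the text is also found.

```python
import re

# Hypothetical, incomplete TLD sample; a real list would be much longer.
TLDS = ["com", "org", "info", "ly"]

# A literal dot or the word "dot", optionally padded with spaces.
SEP = r" *(?:\.|dot) *"

EMAIL = re.compile(
    r"(?<![\w.+])"                  # not preceded by a word character, "." or "+"
    r"(\w+) *@ *"                   # local part, then "@"
    r"((?:\w+" + SEP + r")+)"       # one or more labels, each followed by a separator
    r"(" + "|".join(TLDS) + r")\b"  # the final label must be a listed TLD
)

def extract(text):
    """Return the raw text of every address-like match in text."""
    return [m.group(0) for m in EMAIL.finditer(text)]
```

For example, extract("write to bob@mail.example.com today") picks up the whole three-label address in one match instead of stopping after the second label. Like the patterns in the question, this only allows \w characters in the local part, so addresses with dots or other punctuation before the "@" would need further work.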
Upvotes: 1