Reputation: 19
I've been asked to write a regular expression that can match multi-domain email addresses and to implement it in Python. I came up with the following regular expression (and code; the emphasis is on the regex, though), which I think is correct:
import re
regex = r'\b[\w|\.|-]+@([\w]+\.)+\w{2,4}\b'
input_string = "hey my mail is [email protected]"
match = re.findall(regex, input_string)
print(match)
Now when I run this (using a very simple address), it doesn't catch it! Instead it shows an empty list as the output. Can somebody tell me where I went wrong in the regular expression literal?
Upvotes: 1
Views: 4985
Reputation: 59571
Here's a simple one to start you off:
regex = r'\b[\w.-]+?@\w+?\.\w+?\b'
re.findall(regex,input_string) # ['[email protected]']
One problem with your original is that you don't need the | operator inside a character class ([..]); there it is a literal pipe character, not alternation. Just write [\w|\.|-] as [\w.-]. (If the - is at the end, you don't need to escape it.)
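To see the effect of the pipes, here is a small sketch comparing the two character classes. The sample string and variable names are made up for illustration; only the two classes come from the patterns above.

```python
import re

# Inside [...] the pipe is an ordinary character, not alternation,
# so the first class also accepts a literal '|'.
with_pipes = re.compile(r'[\w|\.|-]+')
without_pipes = re.compile(r'[\w.-]+')

sample = "first.last|other-name"  # hypothetical test string

pipe_matches = with_pipes.findall(sample)
clean_matches = without_pipes.findall(sample)

print(pipe_matches)   # the '|' is swallowed into a single match
print(clean_matches)  # the '|' splits the string into two matches
```

So the extra pipes don't break the pattern, they just let it accept a character you probably didn't intend.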
Next, there are way too many variations on legitimate domain names. Just look for at least one period surrounded by word characters after the @ symbol:
@\w+?\.\w+?\b
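A rough comparison of the patterns, as a sketch rather than a definitive email matcher. The addresses and the `multi` pattern are my own assumptions for illustration. Two things to note: re.findall returns the contents of the group when the pattern contains one capturing group, which is why the original pattern doesn't hand back whole addresses, and the simple pattern stops after the first dot in the domain. A non-capturing group (?:...) is one way around both.

```python
import re

# Made-up addresses for illustration only
text = "write to alice@example.com or bob@mail.example.co.uk"

# Pattern from the question: the (...) group makes findall return
# only the last thing the group captured, not the whole match.
original = r'\b[\w|\.|-]+@([\w]+\.)+\w{2,4}\b'

# Simple pattern from this answer: matches a single dot in the domain.
simple = r'\b[\w.-]+?@\w+?\.\w+?\b'

# Assumed variant: (?:...) groups without capturing, so findall returns
# whole matches, and the repeated group allows several domain labels.
multi = r'\b[\w.-]+@(?:\w+\.)+\w+\b'

original_hits = re.findall(original, text)
simple_hits = re.findall(simple, text)
multi_hits = re.findall(multi, text)

print(original_hits)
print(simple_hits)
print(multi_hits)
```

If you want to keep a capturing group for some reason, re.finditer with match.group(0) would also give you the full match.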
Upvotes: 1