Altons

Reputation: 1424

Exclude specific email address with regex

I have this regex for extracting emails which works fine:

([a-zA-Z][\w\.-]*[a-zA-Z0-9])@([a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z])

However, there are some e-mails I don't want to include, like:

[email protected]
[email protected]
[email protected]

I've been trying to add things like ^(?!server|noreplay|name) but it isn't working.

Also, will using parentheses as above affect the (name, domain) tuples that are returned?

Upvotes: 0

Views: 1412

Answers (2)

yurisich

Reputation: 7109

Check the results from your regex for any emails that match the bad emails list.

# results must hold full address strings (e.g. 'name@domain'), not (name, domain) tuples
results = list_from_your_regex
invalids = ['info', 'server', 'noreply', ...]
valid_emails = [good for good in results if good.split('@')[0] not in invalids]
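
If the addresses come from the question's two-group pattern, re.findall returns (name, domain) tuples rather than strings, so the name can be compared directly instead of splitting on '@'. A minimal sketch under that assumption; the helper name extract_valid, the sample text, and the example.com addresses are made up for illustration:

```python
import re

# Pattern from the question: group 1 is the local part, group 2 the domain.
EMAIL_RE = re.compile(
    r'([a-zA-Z][\w\.-]*[a-zA-Z0-9])@'
    r'([a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z])'
)

def extract_valid(text, invalids):
    """Return (name, domain) tuples whose name is not in invalids."""
    blocked = {name.lower() for name in invalids}
    return [(name, domain) for name, domain in EMAIL_RE.findall(text)
            if name.lower() not in blocked]

# Hypothetical sample input, for illustration only.
sample = "Contact alice@example.com or noreply@example.com, cc server@example.org."
valid = extract_valid(sample, ['info', 'server', 'noreply'])
print(valid)
```

The comparison is case-insensitive here, which is a design choice of this sketch and not something the original one-liner does.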

Upvotes: 0

CoffeeRain

Reputation: 4522

Just check for those email addresses after you extract them...

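import re

bad_addresses = ['[email protected]', '[email protected]', '[email protected]']
# No capture groups here, so findall returns each whole address as a string.
emails = re.findall(r'[a-zA-Z][\w\.-]*[a-zA-Z0-9]@[a-zA-Z0-9][\w\.-]*[a-zA-Z0-9]\.[a-zA-Z][a-zA-Z\.]*[a-zA-Z]', contentwithemails)

# Loop over a copy so that removing from the real list is safe.
for item in emails[:]:
  if item in bad_addresses:
    emails.remove(item)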

You have to loop over a slice of emails (emails[:]) rather than the list itself, because removing items from a list while a for loop is walking it makes the loop skip elements. The slice creates a shallow copy that can be read while the real list is modified.
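
To see why the copy matters, here is a small sketch comparing the two loops, plus a list comprehension that avoids mutation altogether. The addresses and variable names are hypothetical, chosen so that two blocked entries sit next to each other:

```python
# Hypothetical data: two blocked addresses sit next to each other.
bad_addresses = ['info@example.com', 'noreply@example.com']
found = ['info@example.com', 'noreply@example.com', 'alice@example.com']

# Removing while iterating over the same list skips the element
# that slides into the slot just vacated.
unsafe = list(found)
for item in unsafe:
    if item in bad_addresses:
        unsafe.remove(item)

# Iterating over a slice copy visits every original element.
safe = list(found)
for item in safe[:]:
    if item in bad_addresses:
        safe.remove(item)

# A list comprehension builds the filtered list without any mutation.
filtered = [item for item in found if item not in bad_addresses]

print(unsafe)    # 'noreply@example.com' survives the unsafe loop
print(safe)
print(filtered)
```

The comprehension is usually the simpler choice when a new list is acceptable; the slice-copy loop is useful when the original list object has to be modified in place.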

Upvotes: 1
