Reputation: 177
import pandas as pd
df = pd.read_csv("email_addresses_of_ALL_purchasers.csv")
all_emails = df["Email"]
real_emails = []
test_domains = ['yahoo.com', 'gmail.com', 'facebook.com', 'hotmail.com']
for email in all_emails:
    email_separated = email.split("@")
    if email_separated[1] not in test_domains:
        real_emails.append(email)
print real_emails
I'm trying to filter out email addresses from certain account types (domains). Why does the code above produce this error:
IndexError: list index out of range
Upvotes: 0
Views: 235
Reputation: 304137
It is more robust to use partition here. If the @ is missing, domain will simply be the empty string.
for email in all_emails:
    name, delim, domain = email.partition("@")
    if domain and domain not in test_domains:
        real_emails.append(email)
Also, Wikipedia has a list of unusual but valid email address examples that may surprise you.
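To make the difference from split concrete, here is a minimal self-contained sketch. The sample addresses are made up for illustration (the real ones come from the CSV in the question); only test_domains is taken from the question.

```python
# Hypothetical sample data standing in for df["Email"] from the question.
all_emails = ["alice@example.org", "bob@gmail.com", "no-at-sign", "carol@yahoo.com"]
test_domains = ['yahoo.com', 'gmail.com', 'facebook.com', 'hotmail.com']

real_emails = []
for email in all_emails:
    # partition always returns a 3-tuple, so the unpacking cannot fail;
    # when "@" is absent, delim and domain are both empty strings.
    name, delim, domain = email.partition("@")
    if domain and domain not in test_domains:
        real_emails.append(email)

print(real_emails)
```

With this sample data, only alice@example.org is kept: the entry without an @ is skipped silently rather than raising IndexError.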
Upvotes: 0
Reputation: 1579
Try this:
import pandas as pd
df = pd.read_csv("email_addresses_of_ALL_purchasers.csv")
all_emails = df["Email"]
real_emails = []
test_domains = ['yahoo.com', 'gmail.com', 'facebook.com', 'hotmail.com']
for email in all_emails:
    email_separated = email.split("@")
    try:
        if email_separated[1] not in test_domains:
            real_emails.append(email)
    except IndexError:
        print('Mail {} does not contain a @ sign'.format(email))
print real_emails
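If you would rather collect the malformed addresses than print them, a variation on the same try/except idea might look like the sketch below. The function name filter_emails and the sample addresses are hypothetical, not part of the question.

```python
def filter_emails(emails, excluded_domains):
    """Return (kept, malformed) for the given addresses.

    kept holds addresses whose domain is not excluded;
    malformed holds addresses with no "@" at all.
    """
    kept, malformed = [], []
    for email in emails:
        parts = email.split("@")
        try:
            domain = parts[1]
        except IndexError:
            # split returned a single element, so there was no "@".
            malformed.append(email)
            continue
        if domain not in excluded_domains:
            kept.append(email)
    return kept, malformed


# Hypothetical sample data for illustration only.
kept, malformed = filter_emails(
    ["alice@example.org", "bob@gmail.com", "broken-entry"],
    ['yahoo.com', 'gmail.com', 'facebook.com', 'hotmail.com'],
)
print(kept)
print(malformed)
```

Returning both lists lets you inspect or fix the bad rows afterwards instead of reading them off the console.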
Upvotes: 2
Reputation: 32429
Apparently one of your emails does not contain an @.
Put a print(email) as the first statement of the loop; then you can see which email doesn't fit.
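If the list is long, it may be quicker to list only the offending entries along with their positions. A small sketch, using made-up sample data in place of df["Email"]:

```python
# Hypothetical sample data standing in for df["Email"] from the question.
all_emails = ["alice@example.org", "missing-at.example.org", "bob@gmail.com"]

# Pair each address that lacks an "@" with its position in the list.
bad = [(i, email) for i, email in enumerate(all_emails) if "@" not in email]
print(bad)
```

Each tuple gives the index and the value of an entry that would trigger the IndexError in the original loop.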
Upvotes: 6