user1947085

Reputation: 177

Python list index out of range error

import pandas as pd
df = pd.read_csv("email_addresses_of_ALL_purchasers.csv")
all_emails = df["Email"]
real_emails = [] 

test_domains = ['yahoo.com', 'gmail.com', 'facebook.com', 'hotmail.com']

for email in all_emails: 
    email_separated = email.split("@")
    if email_separated[1] not in test_domains:
        real_emails.append(email) 
print real_emails

I'm trying to filter out certain email account types. Why does the code above produce this error?

IndexError: list index out of range

Upvotes: 0

Views: 235

Answers (3)

John La Rooy

Reputation: 304137

It is more robust to use partition here. If the @ is missing, domain will simply be the empty string.

for email in all_emails:
    # partition always returns three parts, even when "@" is absent
    name, delim, domain = email.partition("@")
    if domain and domain not in test_domains:
        real_emails.append(email)

Also, Wikipedia has a list of unusual but valid email address examples that may surprise you.
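
As a rough, self-contained sketch of the partition approach: the function name filter_real_emails and the sample addresses below are made up for illustration, and a plain list stands in for the CSV column.

```python
test_domains = ['yahoo.com', 'gmail.com', 'facebook.com', 'hotmail.com']

def filter_real_emails(all_emails):
    """Keep addresses that have a domain which is not in test_domains."""
    real_emails = []
    for email in all_emails:
        # partition always returns a 3-tuple; domain is '' when "@" is missing
        name, delim, domain = email.partition("@")
        if domain and domain not in test_domains:
            real_emails.append(email)
    return real_emails

# Hypothetical sample data, not taken from the question's CSV file
sample = ['alice@example.org', 'bob@gmail.com', 'no-at-sign', 'carol@']
result = filter_real_emails(sample)
```

Here 'no-at-sign' and 'carol@' are skipped without raising, because their domain part is empty.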

Upvotes: 0

Nikolai Tschacher

Reputation: 1579

Try this:

import pandas as pd
df = pd.read_csv("email_addresses_of_ALL_purchasers.csv")
all_emails = df["Email"]
real_emails = [] 

test_domains = ['yahoo.com', 'gmail.com', 'facebook.com', 'hotmail.com']

for email in all_emails: 
    email_separated = email.split("@")
    try:
        if email_separated[1] not in test_domains:
            real_emails.append(email)
    except IndexError:
        print('Mail {} does not contain a @ sign'.format(email))
print real_emails

Upvotes: 2

Hyperboreus

Reputation: 32429

Apparently one of your emails does not contain a @.

Put print(email) as the first statement of the loop; the last address printed before the error is the one that doesn't fit.
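
If the list is long, collecting the offending entries may be easier than reading the printed output. A minimal sketch, assuming every entry is a string; the helper name find_malformed and the sample data are hypothetical:

```python
def find_malformed(all_emails):
    """Return (position, value) pairs for entries that lack an "@"."""
    bad = []
    for i, email in enumerate(all_emails):
        if "@" not in email:
            bad.append((i, email))
    return bad

# Hypothetical sample data standing in for the "Email" column
sample = ['alice@example.org', 'no-at-sign', 'bob@gmail.com', '']
bad = find_malformed(sample)
```

With the question's data, this could be called as find_malformed(all_emails) before running the filtering loop.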

Upvotes: 6
