Reputation: 2600
My question is regarding a Pandas DataFrame and a list of e-mail addresses. The simplified dataframe (called 'df') looks like this:
    Name         Address  Email
0   Bush    Apple Street
1   Volt   Orange Street
2  Smith     Kiwi Street
The simplified list of e-mail addresses looks like this:
list_of_emails = ['[email protected]', '[email protected]', '[email protected]']
Is it possible to loop through the dataframe, check whether a last name is (part of) an e-mail address, and then add that e-mail address to the dataframe? Unfortunately, the following code does not work, I think because of line 2:
for index, row in df.iterrows():
    if row['Name'] in x for x in list_of_emails:
        df['Email'][index] = x
Your help is very much appreciated!
Upvotes: 0
Views: 4344
Reputation: 77027
Here's one way, using apply and a lambda function.
For the first match:
In [450]: df.Name.apply(
              lambda x: next((e for e in list_of_emails if x.lower() in e), None))
Out[450]:
0 [email protected]
1 [email protected]
2 [email protected]
Name: Name, dtype: object
For all matches, as a list:
In [451]: df.Name.apply(lambda x: [e for e in list_of_emails if x.lower() in e])
Out[451]:
0 [[email protected]]
1 [[email protected]]
2 [[email protected]]
Name: Name, dtype: object
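To get the result into the dataframe, the Series returned by apply can be assigned to the Email column. Below is a minimal sketch of that step. The e-mail addresses are hypothetical placeholders (the real ones are obscured in the post), and first_match is just an illustrative helper name. It also lowercases the e-mail side, which the snippets above do not do.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Bush", "Volt", "Smith"],
    "Address": ["Apple Street", "Orange Street", "Kiwi Street"],
})

# Hypothetical placeholder addresses, not the ones from the question.
list_of_emails = [
    "john.smith@example.com",
    "a.volt@example.com",
    "g.bush@example.com",
]

def first_match(name):
    """Return the first e-mail containing the name (case-insensitive), else None."""
    needle = name.lower()
    return next((e for e in list_of_emails if needle in e.lower()), None)

# Assign the matched addresses as a new column.
df["Email"] = df["Name"].apply(first_match)
print(df)
```

Names without any matching address end up as None in the Email column, so they are easy to spot afterwards with df['Email'].isna().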
Upvotes: 3
Reputation: 81684
Generally, you should use iterrows only as a last resort.
Consider this:
import pandas as pd
df = pd.DataFrame({'Name': ['Smith', 'Volt', 'Bush']})
list_of_emails = ['[email protected]', '[email protected]', '[email protected]']
def foo(name):
    for email in list_of_emails:
        if name.lower() in email:
            return email
df['Email'] = df['Name'].apply(foo)
print(df)
# Name Email
# 0 Smith [email protected]
# 1 Volt [email protected]
# 2 Bush [email protected]
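For comparison, here is a sketch of how the loop from the question could be repaired if an explicit loop is still preferred. As far as I can tell, the original fails for two reasons: a bare `x for x in ...` after `if` is a syntax error, and `df['Email'][index] = x` is chained assignment, which pandas may warn about and which is not guaranteed to write to the original frame. The addresses below are hypothetical placeholders, since the real ones are obscured in the post.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Smith", "Volt", "Bush"]})

# Hypothetical placeholder addresses, not the ones from the question.
list_of_emails = [
    "john.smith@example.com",
    "a.volt@example.com",
    "g.bush@example.com",
]

# Create the column first so there is something to assign into.
df["Email"] = None

for index, row in df.iterrows():
    for email in list_of_emails:
        if row["Name"].lower() in email.lower():
            # .at sets a single cell by row label and column name.
            df.at[index, "Email"] = email
            break  # keep only the first match

print(df)
```

This should give the same column as the apply version, just more slowly on large frames.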
Upvotes: 4