Reputation: 41
I have a list of dictionaries:
example_list = [{'email':'[email protected]'},{'email':'[email protected]'}]
and a dataframe with an 'Email' column.
I need to compare the list against the dataframe and return the values from the list that are not in the dataframe.
I could iterate over the list and check each value against the dataframe, but I'm looking for a more Pythonic way, perhaps a list comprehension or a map function on the dataframe?
Upvotes: 3
Views: 157
Reputation: 41
I ended up converting the list into a dataframe, merging the two dataframes on the email column, and then building a dataframe of the missing values.
For example:
example_list = [{'email':'[email protected]'},{'email':'[email protected]'}]
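# Build a dataframe from the list; rename the column to match df_one
df_two = pd.DataFrame(example_list).rename(columns={'email': 'Email'})
# Emails present in both dataframes
common = df_one.merge(df_two, on='Email')
# Emails from the list that are missing from df_one
df_diff = df_two[~df_two.Email.isin(common.Email)]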
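The merge-based idea above can also be written as a single left merge with pandas' indicator flag, which marks where each row came from. This is only a sketch: the sample addresses and the contents of df_one are made up for illustration, and the column names follow the question ('email' in the list, 'Email' in the dataframe).

```python
import pandas as pd

# Hypothetical sample data; the addresses are placeholders, not from the post.
example_list = [{'email': 'a@example.com'}, {'email': 'c@example.com'}]
df_one = pd.DataFrame({'Email': ['a@example.com', 'b@example.com']})

# Build a dataframe from the list and align the column name with df_one.
df_two = pd.DataFrame(example_list).rename(columns={'email': 'Email'})

# Left merge with indicator=True adds a '_merge' column whose value is
# 'left_only' for rows of df_two that have no match in df_one.
merged = df_two.merge(
    df_one[['Email']].drop_duplicates(), on='Email', how='left', indicator=True
)
df_diff = merged.loc[merged['_merge'] == 'left_only', ['Email']]

missing = df_diff['Email'].tolist()
print(missing)
```

Dropping duplicates on the right side keeps the merge from repeating list rows when df_one contains the same address more than once.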
Upvotes: 0
Reputation: 164623
One way is to subtract one set from another. For a functional solution you can use operator.itemgetter:
from operator import itemgetter
res = set(map(itemgetter('email'), example_list)) - set(df['Email'])
Note that the - operator is syntactic sugar for set.difference.
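As a runnable sketch of the set subtraction described above, with made-up sample data (the addresses and the dataframe contents are assumptions; the 'Email' column name is taken from the question):

```python
from operator import itemgetter

import pandas as pd

# Hypothetical sample data; the addresses are placeholders.
example_list = [{'email': 'a@example.com'}, {'email': 'c@example.com'}]
df = pd.DataFrame({'Email': ['a@example.com', 'b@example.com']})

# The - operator needs a set on both sides.
res = set(map(itemgetter('email'), example_list)) - set(df['Email'])

# The method form accepts any iterable, so the Series can be passed directly.
res_alt = set(map(itemgetter('email'), example_list)).difference(df['Email'])

print(res)
```

Both forms give the same result; the only practical difference is that the operator form requires the explicit set(...) conversion of the column.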
Upvotes: 1
Reputation: 402353
To return the values that are not in the dataframe's Email column, here are a couple of options involving set difference operations:
np.setdiff1d
emails = [d['email'] for d in example_list]
diff = np.setdiff1d(emails, df['Email'])  # returns a sorted NumPy array of unique values
set.difference
# returns a set
diff = set(d['email'] for d in example_list).difference(df['Email'])
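Both options above discard the original order of the list (np.setdiff1d sorts its result, and a set is unordered). If order matters, a list comprehension against a set of known emails is one alternative. The sketch below uses made-up addresses and dataframe contents, so treat the data as an assumption:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the addresses are placeholders.
example_list = [
    {'email': 'z@example.com'},
    {'email': 'a@example.com'},
    {'email': 'c@example.com'},
]
df = pd.DataFrame({'Email': ['a@example.com', 'b@example.com']})

emails = [d['email'] for d in example_list]

# np.setdiff1d returns the sorted unique values of emails not in df['Email'].
diff_sorted = np.setdiff1d(emails, df['Email']).tolist()

# A list comprehension keeps the list's original order (and any duplicates);
# building the set once makes each membership test constant time on average.
known = set(df['Email'])
diff_ordered = [e for e in emails if e not in known]

print(diff_sorted)
print(diff_ordered)
```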
Upvotes: 1