Reputation: 37
I'm trying to check whether the value in one DataFrame column is contained in a series held in a separate column of the same row. I'm receiving the error "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."
I've researched this, but I don't quite understand why I'm receiving the error in this specific instance.
I've tried using both the .contains functions.
A simplified version of the DataFrame structure is as follows:
df
index id id_list in_series (desired return column)
1 23 [1,2,34,56,75] False
2 14 [1,5,14,23,45] True
3 2 [1,2,4,25,37] True
4 14 [2,4,34,26,77] False
5 27 [1,6,19,27,50] True
a = df['id']
b = df['id_list']
df['in_series'] = b.str.contains(a, regex=False)
Is there a better way of going about this?
Upvotes: 1
Views: 98
Reputation: 2036
You can still use a loop:
import pandas as pd

id_list = [[1,2,34,56,75],[1,5,14,23,45],[1,2,4,25,37],[2,4,34,26,77],[1,6,19,27,50]]
ids = [23,14,2,14,27]  # named ids rather than id to avoid shadowing the built-in
df = pd.DataFrame([ids, id_list]).T
df.columns = ["id", "id_list"]
boo = []
for i in range(len(df)):
    # is this row's id present in this row's id_list?
    boo.append(df.iloc[i, 0] in df.iloc[i, 1])
df["in_series (desired return column)"] = boo
In this case you don't need to change the type of your data.
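The same row-by-row check can also be written without positional indexing by pairing the two columns with zip. This is only a sketch: it assumes id_list holds real Python lists (not strings), and the column name in_series is taken from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [23, 14, 2, 14, 27],
    "id_list": [[1, 2, 34, 56, 75], [1, 5, 14, 23, 45], [1, 2, 4, 25, 37],
                [2, 4, 34, 26, 77], [1, 6, 19, 27, 50]],
})

# Pair each id with the list from the same row and test membership.
df["in_series"] = [i in lst for i, lst in zip(df["id"], df["id_list"])]
print(df)
```

Because the test is a plain `in` on a list, 2 matches only the element 2 and not 25 or 23.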
Upvotes: 1
Reputation: 42886
One of the few cases where we can use apply to check the presence of id in id_list:
# direct membership test; assumes id_list holds actual lists, and avoids substring matches such as '3' in '34'
df['in_series'] = df.apply(lambda x: x['id'] in x['id_list'], axis=1)
id id_list in_series
0 23 [1, 2, 34, 56, 75] False
1 14 [1, 5, 14, 23, 45] True
2 2 [1, 2, 4, 25, 37] True
3 14 [2, 4, 34, 26, 77] False
4 27 [1, 6, 19, 27, 50] True
Upvotes: 1
Reputation: 23099
A little list comprehension magic should work:
# note: if id_list holds strings such as "[1,2,34,56,75]", this is a substring test
df['in_series (desired return column)'] = [str(df.id[i]) in df.id_list[i]
                                           for i in range(len(df))]
print(df)
index id id_list in_series (desired return column)
0 1 23 [1,2,34,56,75] False
1 2 14 [1,5,14,23,45] True
2 3 2 [1,2,4,25,37] True
3 4 14 [2,4,34,26,77] False
4 5 27 [1,6,19,27,50] True
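If id_list is stored as strings like "[1,2,34,56,75]" (an assumption, suggested by the unspaced output above), a substring test can give false positives, for example "3" is found inside "34". One possible workaround, sketched here with made-up sample rows, is to parse each string into a list with ast.literal_eval before testing membership:

```python
import ast
import pandas as pd

# Hypothetical sample: id_list values are strings, not lists.
df = pd.DataFrame({
    "id": [23, 14, 2, 3],
    "id_list": ["[1,2,34,56,75]", "[1,5,14,23,45]",
                "[1,2,4,25,37]", "[2,4,34,26,77]"],
})

# Parse each string into a real list of ints.
parsed = df["id_list"].map(ast.literal_eval)

# Exact membership test on the parsed lists.
df["in_series"] = [i in lst for i, lst in zip(df["id"], parsed)]

# For comparison: the substring test on the raw strings.
substring = [str(i) in s for i, s in zip(df["id"], df["id_list"])]
print(df)
print(substring)
```

In the last row, the substring test reports a match for 3 because of 34, while the parsed version does not.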
Upvotes: 0