Reputation: 741
I have the followings:
df = pd.DataFrame({"value":['A','B','B','C','B'],'my_list':[['J1','J4'],['J2','J9','J1'],['J0','J9','J2'],['J2'],['V13','X9','J1']]})
value my_list
0 A [J1, J4]
1 B [J2, J9, J1]
2 B [J0, J9, J2]
3 C [J2]
4 B [V13, X9, J1]
and I wish to find all values B that does not have J2 in it. If using lambda I can do:
df.apply(lambda x: x['value']=='B' and 'J2' not in x['my_list'], axis=1)
0 False
1 False
2 False
3 False
4 True
But I wish to do something like this:
df[(df['value']=='B') & ('J2' not in df['my_list'])]
doable?
Upvotes: 1
Views: 281
Reputation: 75100
Using list, dict etc in a pandas dataframe loses the befinifit of vectorization. If storing these values with these dtypes are mandatory, list comprehension works faster:
df['value'].eq("B")&['J2' not in i for i in df['my_list']]
Other methods:
Converting to dataframe looks like:
df['value'].eq('B') & ~pd.DataFrame(df['my_list'].tolist()).isin(['J2']).any(1)
Or converting to string :
df['value'].eq('B')& ~df['my_list'].astype(str).str.contains("J2")
And:
df['value'].eq("B")&df['my_list'].astype(str).str.count("J2").ne(1)
Upvotes: 2