Reputation: 81
I am facing the following situation. Let's say I have this DataFrame:
import pandas as pd
data = [{'column_1': 1, 'column_2': ['lala', 'lili', 'lele']}, {'column_1': 5, 'column_2': ['lala', 'lili', 'lolo']}]
df = pd.DataFrame(data)
df
column_1 column_2
0 1 [lala, lili, lele]
1 5 [lala, lili, lolo]
and I have three strings:
string_1 = 'lele'
string_2 = 'lulu'
string_3 = 'lolo'
How could I check whether string_1 is in the list in column_2 and return that row?
Upvotes: 2
Views: 120
Reputation: 30920
We could also use:
df[df['column_2'].apply(pd.Series).eq(string_1).any(axis=1)]
Another approach uses Series.explode, Series.groupby and Series.any (the any(level=0) shortcut was removed in pandas 2.0, so group on the index level instead):
df[df['column_2'].explode().eq(string_1).groupby(level=0).any()]
Output
column_1 column_2
0 1 [lala, lili, lele]
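To make the explode route easier to follow, here is a minimal step-by-step sketch using the data from the question. The intermediate names `exploded`, `mask` and `result` are mine, not from the answer, and it assumes the DataFrame index is unique so the grouped mask lines up with the original rows.

```python
import pandas as pd

data = [
    {'column_1': 1, 'column_2': ['lala', 'lili', 'lele']},
    {'column_1': 5, 'column_2': ['lala', 'lili', 'lolo']},
]
df = pd.DataFrame(data)
string_1 = 'lele'

# One row per list element; the original row label is repeated in the index.
exploded = df['column_2'].explode()

# Group the element-wise comparison by the original row label:
# a row is kept if any of its elements equals string_1.
mask = exploded.eq(string_1).groupby(level=0).any()

result = df[mask]
print(result)
```

Splitting it up like this also lets you inspect `exploded` if the filter returns something unexpected.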
Upvotes: 2
Reputation: 862511
Use boolean indexing with in to check membership in each list:
string_1 = 'lele'
df1 = df[df['column_2'].apply(lambda x: string_1 in x)]
#alternative
#df1 = df[[string_1 in x for x in df['column_2']]]
print (df1)
column_1 column_2
0 1 [lala, lili, lele]
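Since the question has three strings, the same membership test can be wrapped in a small function and reused. This is only a sketch: `rows_containing` is a hypothetical helper name, not a pandas function.

```python
import pandas as pd

data = [
    {'column_1': 1, 'column_2': ['lala', 'lili', 'lele']},
    {'column_1': 5, 'column_2': ['lala', 'lili', 'lolo']},
]
df = pd.DataFrame(data)


def rows_containing(frame, column, value):
    """Return the rows of `frame` whose list in `column` contains `value`."""
    # Plain Python membership test, one boolean per row.
    mask = [value in items for items in frame[column]]
    return frame[mask]


# Check each of the three strings from the question.
for s in ('lele', 'lulu', 'lolo'):
    matches = rows_containing(df, 'column_2', s)
    print(s, matches.index.tolist())
```

A string that appears in no list (here 'lulu') simply gives back an empty DataFrame.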
Another solution is to create a helper DataFrame and test for at least one match per row with DataFrame.eq and DataFrame.any:
df1 = df[pd.DataFrame(df['column_2'].tolist(), index=df.index).eq(string_1).any(axis=1)]
print (df1)
column_1 column_2
0 1 [lala, lili, lele]
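If it helps to see what the helper DataFrame looks like, here is a small sketch with the question's data. The names `helper` and `mask` are only for illustration.

```python
import pandas as pd

data = [
    {'column_1': 1, 'column_2': ['lala', 'lili', 'lele']},
    {'column_1': 5, 'column_2': ['lala', 'lili', 'lolo']},
]
df = pd.DataFrame(data)
string_1 = 'lele'

# Each list becomes one row and each list position one column.
# Passing index=df.index keeps the row labels aligned with df.
helper = pd.DataFrame(df['column_2'].tolist(), index=df.index)
print(helper)

# Element-wise comparison, then "at least one True" per row.
mask = helper.eq(string_1).any(axis=1)
df1 = df[mask]
print(df1)
```

If the lists had different lengths, the shorter rows would be padded with missing values in `helper`, which never compare equal to the string.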
Upvotes: 3