How to check in pandas if an element is in a column whose values are arrays

Question

I am facing this situation. Let's say I have this DataFrame:

import pandas as pd

data = [{'column_1': 1, 'column_2': ['lala', 'lili', 'lele']}, {'column_1': 5, 'column_2': ['lala', 'lili', 'lolo']}]
df = pd.DataFrame(data)
df
   column_1            column_2
0         1  [lala, lili, lele]
1         5  [lala, lili, lolo]

and I have three strings:

string_1 = 'lele'

string_2 = 'lulu'

string_3 = 'lolo'

How can I check whether string_1 is in the list stored in column_2 and return the matching row?

jezrael · Accepted Answer

Use boolean indexing with the in operator to check membership in each row's list:

string_1 = 'lele'

df1 = df[df['column_2'].apply(lambda x: string_1 in x)]
# alternative: build the boolean mask with a list comprehension
# df1 = df[[string_1 in x for x in df['column_2']]]
print(df1)
   column_1            column_2
0         1  [lala, lili, lele]
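
To cover all three strings from the question, the same mask can be wrapped in a small function. This is only a sketch: rows_containing is a hypothetical helper name, not part of pandas, and it assumes every cell in column_2 holds a list.

```python
import pandas as pd

data = [
    {'column_1': 1, 'column_2': ['lala', 'lili', 'lele']},
    {'column_1': 5, 'column_2': ['lala', 'lili', 'lolo']},
]
df = pd.DataFrame(data)


def rows_containing(frame, value, column='column_2'):
    # Hypothetical helper: keep the rows whose list in `column` contains `value`.
    mask = frame[column].apply(lambda items: value in items)
    return frame[mask]


# Run the check for each of the three strings from the question.
results = {s: rows_containing(df, s) for s in ['lele', 'lulu', 'lolo']}
for s, matched in results.items():
    print(s, matched.index.tolist())
# lele [0]
# lulu []
# lolo [1]
```

A string that appears in no list, such as 'lulu', simply gives an empty DataFrame rather than an error.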

Another solution is to create a helper DataFrame and test for at least one match per row with DataFrame.eq and DataFrame.any:

df1 = df[pd.DataFrame(df['column_2'].tolist(), index=df.index).eq(string_1).any(axis=1)]
print(df1)
   column_1            column_2
0         1  [lala, lili, lele]
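
It may help to see the one-liner above split into its intermediate steps. The variable names helper, matches and mask are only illustrative:

```python
import pandas as pd

data = [
    {'column_1': 1, 'column_2': ['lala', 'lili', 'lele']},
    {'column_1': 5, 'column_2': ['lala', 'lili', 'lolo']},
]
df = pd.DataFrame(data)

# Step 1: spread each list over its own columns (0, 1, 2), keeping df's index.
helper = pd.DataFrame(df['column_2'].tolist(), index=df.index)

# Step 2: compare every cell with the string, giving a boolean DataFrame.
matches = helper.eq('lele')

# Step 3: a row qualifies if any of its cells matched.
mask = matches.any(axis=1)

# Step 4: use the mask for boolean indexing on the original DataFrame.
df1 = df[mask]
print(df1)
```

If the lists have different lengths, the shorter rows of the helper DataFrame are padded with missing values, and these compare as False in eq, so they should not produce false matches.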
