Hacendado

Reputation: 81

How to check in pandas whether an element is in a column that holds lists

Say I have this DataFrame:

import pandas as pd

data = [{'column_1': 1, 'column_2': ['lala', 'lili', 'lele']},{'column_1': 5, 'column_2':['lala', 'lili', 'lolo']}]
df = pd.DataFrame(data)
df
   column_1            column_2
0         1  [lala, lili, lele]
1         5  [lala, lili, lolo]

and I have three strings:

string_1 = 'lele'

string_2 = 'lulu'

string_3 = 'lolo'

How can I check whether string_1 is in the list in column_2 and return the matching row?

Upvotes: 2

Views: 120

Answers (2)

ansev

Reputation: 30920

We could also use:

df[df['column_2'].apply(pd.Series).eq(string_1).any(axis=1)]

Another approach uses Series.explode, Series.groupby and any:

df[df['column_2'].explode().eq(string_1).groupby(level=0).any()]

Output

   column_1            column_2
0         1  [lala, lili, lele]
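
To see why the explode variant selects the right row, here is a minimal sketch of the intermediate steps. It assumes pandas 0.25 or later, where Series.explode is available; the names exploded, mask and result are only illustrative and are not part of the answer.

```python
import pandas as pd

data = [
    {'column_1': 1, 'column_2': ['lala', 'lili', 'lele']},
    {'column_1': 5, 'column_2': ['lala', 'lili', 'lolo']},
]
df = pd.DataFrame(data)
string_1 = 'lele'

# One row per list element; the original row label is repeated in the index.
exploded = df['column_2'].explode()

# Compare every element, then collapse back to one boolean per original row
# by grouping on the repeated index labels.
mask = exploded.eq(string_1).groupby(level=0).any()

result = df[mask]
print(result)
```

Because the mask is indexed by the original row labels, it lines up with df directly when used for boolean indexing.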

Upvotes: 2

jezrael

Reputation: 862511

Use boolean indexing with the in operator to check membership in each list:

string_1 = 'lele'

df1 = df[df['column_2'].apply(lambda x: string_1 in x)]
#alternative
#df1 = df[[string_1 in x for x in df['column_2']]]
print (df1)
   column_1            column_2
0         1  [lala, lili, lele]
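
The same membership test can be applied to all three strings from the question. A small sketch, where rows_containing is a hypothetical helper introduced only for illustration:

```python
import pandas as pd

df = pd.DataFrame([
    {'column_1': 1, 'column_2': ['lala', 'lili', 'lele']},
    {'column_1': 5, 'column_2': ['lala', 'lili', 'lolo']},
])

def rows_containing(frame, column, value):
    """Return the rows of frame whose list in column contains value."""
    # Build one boolean per row with a plain Python membership test.
    mask = [value in items for items in frame[column]]
    return frame[mask]

for s in ('lele', 'lulu', 'lolo'):
    # Print the index labels of the matching rows for each string.
    print(s, rows_containing(df, 'column_2', s).index.tolist())
```

A string that appears in no list, such as 'lulu', simply yields an empty DataFrame.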

Another solution is to create a helper DataFrame and test for at least one match per row with DataFrame.eq and DataFrame.any:

df1 = df[pd.DataFrame(df['column_2'].tolist(), index=df.index).eq(string_1).any(axis=1)]
print (df1)
   column_1            column_2
0         1  [lala, lili, lele]
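
The helper DataFrame also copes with lists of different lengths, because shorter rows are padded with missing values that never compare equal to a string. A minimal sketch with made-up data (the uneven lists below are an assumption for illustration, not the data from the question):

```python
import pandas as pd

# Hypothetical data: the first list is shorter than the second.
df = pd.DataFrame({
    'column_1': [1, 5],
    'column_2': [['lala', 'lele'], ['lala', 'lili', 'lolo']],
})

# One column per list position; the missing third cell of row 0 is padded.
helper = pd.DataFrame(df['column_2'].tolist(), index=df.index)

# Element-wise comparison, then "at least one match" per row.
mask = helper.eq('lele').any(axis=1)
print(df[mask])
```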

Upvotes: 3
