Marc

Reputation: 199

Selecting rows with column value in other column list in Python

I have a pandas data frame with the following format:

col1  col2  ...      col4
   A     2        [2-3-4]
   B     3        [2-6]
   A     3        [2-3-4]
   C     2        [2-3-4]
   D     2        [2-3-4]

I would like to select only the rows where the value in col2 is in the list of col4.

I tried to use:

df[df["col2"].isin(df["col4"].str.split("-"))]

but I get an empty data frame...

Upvotes: 2

Views: 144

Answers (3)

Utsav

Reputation: 5918

Code

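# Turn "[2-3-4]" into "[2,3,4]" and make col2 a string so it can be searched for in col4
df['col4'] = df.col4.astype(str).str.replace('-', ',')
df['col2'] = df.col2.astype(str)
# Keep rows whose col2 text occurs in the col4 text (a substring match)
df = df[df.apply(lambda x: x.col2 in x.col4, axis=1)]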

Output

    col1    col2    col4
0   A       2   [2,3,4]
2   A       3   [2,3,4]
3   C       2   [2,3,4]
4   D       2   [2,3,4]

Upvotes: 2

anky

Reputation: 75080

I would use a list comprehension for this use case:

df[[str(a) in b for a,b in zip(df['col2'],df['col4'])]]

  col1  col2     col4
0    A     2  [2-3-4]
2    A     3  [2-3-4]
3    C     2  [2-3-4]
4    D     2  [2-3-4]

Or use a regex search with word boundaries, which will not match 2 against 22 (thanks @Nk03):

import re
df[[bool(re.search(fr'\b{a}\b',b)) for a,b in zip(df['col2'],df['col4'])]]
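
To see the difference between the two approaches, here is a rough, self-contained sketch. The sample frame is rebuilt from the question, col4 is assumed to hold strings such as "[2-3-4]", and the "[22-3-4]" value in the last row is a made-up edge case that is not in the original data:

```python
import re
import pandas as pd

# Sample data modelled on the question; col4 is assumed to hold strings.
df = pd.DataFrame({
    "col1": ["A", "B", "A", "C", "D"],
    "col2": [2, 3, 3, 2, 2],
    # The last value is a hypothetical edge case added for illustration.
    "col4": ["[2-3-4]", "[2-6]", "[2-3-4]", "[2-3-4]", "[22-3-4]"],
})

# Plain substring test: "2" is found inside "22", so the last row is kept by mistake.
substring_mask = [str(a) in b for a, b in zip(df["col2"], df["col4"])]

# Word-boundary regex: \b2\b does not match inside "22", so the last row is dropped.
regex_mask = [bool(re.search(rf"\b{a}\b", b)) for a, b in zip(df["col2"], df["col4"])]

print(df[substring_mask].index.tolist())  # [0, 2, 3, 4]
print(df[regex_mask].index.tolist())      # [0, 2, 3]
```

If the values in col2 could ever contain regex metacharacters, wrapping them in re.escape would be the safer choice.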

Upvotes: 4

Nk03

Reputation: 14949

You can try this:

import ast
df.col4 = df.col4.str.replace('-',',').apply(ast.literal_eval)
new_df = df[df.apply(lambda x: x['col2'] in x['col4'], axis=1)]
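
For reference, a minimal runnable sketch of the same idea without ast, assuming col4 holds strings like "[2-3-4]" and col2 holds integers (the frame is rebuilt from the question, and the names parsed and mask are only illustrative):

```python
import pandas as pd

# Sample data rebuilt from the question; col4 is assumed to hold strings.
df = pd.DataFrame({
    "col1": ["A", "B", "A", "C", "D"],
    "col2": [2, 3, 3, 2, 2],
    "col4": ["[2-3-4]", "[2-6]", "[2-3-4]", "[2-3-4]", "[2-3-4]"],
})

# Strip the surrounding brackets, split on "-", and convert each piece to int.
parsed = (
    df["col4"]
    .str.strip("[]")
    .str.split("-")
    .apply(lambda parts: [int(p) for p in parts])
)

# Row-wise membership test: is this row's col2 in this row's own list?
mask = [value in items for value, items in zip(df["col2"], parsed)]
new_df = df[mask]
print(new_df)
```

Because the comparison is between integers rather than text, 2 cannot accidentally match 22, and the original col4 column is left unchanged.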

Upvotes: 1
