JocHwo
JocHwo

Reputation: 119

Filter rows in pandas, column with delimitor

I have problem with a filter in a dataframe, I have several columns that have values separeted by (,). I need filter if one of these values is greater than a 3 (for the first column) and for 8 in the second column (the values are not sorted, and I have NaN in some rows)

Example of df:

data = {'ID':  ["1", "2","3","4"],
        'Filter1': ['1', '1,3,5','2,1','7,5'],
        'Filter2': ['20,5','7,13','8','9,15,18']
        }
df = pd.DataFrame (data, columns = ['ID','Filter1','Filter2'])

    ID  Filter1   Filter2
0   1   1         20;5
1   2   1;3;5     7;13
2   3   2;1       8
3   4   7;5       9;15;18

I think split(',') is useful, but I don't know how to apply this, I think any() is useful too.

I know with df["filter1"].str.split(",") obtained a list, but I don't how filter in the same line, without more complex.

My second idea is split de column, and filter with a regular expresion for the name of the columns, but I but I can't get it to work.

df['Filter1].str.split(',', expand=True) 

I need obtain something like this

        ID  Filter1   Filter2
   1    2   1;3;5     7;13
   3    4   7;5       9;15;18

Upvotes: 1

Views: 68

Answers (1)

BENY
BENY

Reputation: 323306

Then let us use split with any

s1 = df.Filter1.str.split(',',expand=True).astype(float).gt(3).any(1)
s2 = df.Filter2.str.split(',',expand=True).astype(float).gt(8).any(1)
newdf = df[s1 & s2]
newdf
Out[36]: 
  ID Filter1  Filter2
1  2   1,3,5     7,13
3  4     7,5  9,15,18

Upvotes: 1

Related Questions