Reputation: 119
I have problem with a filter in a dataframe, I have several columns that have values separeted by (,). I need filter if one of these values is greater than a 3 (for the first column) and for 8 in the second column (the values are not sorted, and I have NaN in some rows)
Example of df:
data = {'ID': ["1", "2","3","4"],
'Filter1': ['1', '1,3,5','2,1','7,5'],
'Filter2': ['20,5','7,13','8','9,15,18']
}
df = pd.DataFrame (data, columns = ['ID','Filter1','Filter2'])
ID Filter1 Filter2
0 1 1 20;5
1 2 1;3;5 7;13
2 3 2;1 8
3 4 7;5 9;15;18
I think split(',') is useful, but I don't know how to apply this, I think any() is useful too.
I know with df["filter1"].str.split(",") obtained a list, but I don't how filter in the same line, without more complex.
My second idea is split de column, and filter with a regular expresion for the name of the columns, but I but I can't get it to work.
df['Filter1].str.split(',', expand=True)
I need obtain something like this
ID Filter1 Filter2
1 2 1;3;5 7;13
3 4 7;5 9;15;18
Upvotes: 1
Views: 68
Reputation: 323306
Then let us use split
with any
s1 = df.Filter1.str.split(',',expand=True).astype(float).gt(3).any(1)
s2 = df.Filter2.str.split(',',expand=True).astype(float).gt(8).any(1)
newdf = df[s1 & s2]
newdf
Out[36]:
ID Filter1 Filter2
1 2 1,3,5 7,13
3 4 7,5 9,15,18
Upvotes: 1