Reputation: 931

how to filter rows based on

I have a dataframe in pandas where one of the column (i.e., column 'b') contains strings with $ symbols:

import numpy as np 
import pandas as pd 

df = pd.DataFrame({'a': [51, 2,32,99,81], 'b': ['$3', '$4','$-','$0','$23']})

I want to filter the dataframe such that I only retain the rows where column'b' only returns integers other then zero and the $ symbol is discarded.

My desired output is:

Any feedback is welcome.

Upvotes: 1

Answers (3)

MaxU - stand with Ukraine

Reputation: 210842

In [64]: df = pd.DataFrame({'a': [51, 2,32,99,81], 'b': ['$3', '$4','$-','$0','$23']})

In [65]: df['b'] = pd.to_numeric(df['b'].str.replace(r'\D+', ''), errors='coerce')

In [67]: df
Out[67]:
    a     b
0  51   3.0
1   2   4.0
2  32   NaN
3  99   0.0
4  81  23.0

In [68]: df = df[df['b'].notnull() & df['b'].ne(0)]

In [69]: df
Out[69]:
    a     b
0  51   3.0
1   2   4.0
4  81  23.0

alternatively we can filter it this way:

In [73]: df = df.query("b == b and b != 0")

In [74]: df
Out[74]:
    a     b
0  51   3.0
1   2   4.0
4  81  23.0

Upvotes: 6

cs95

Reputation: 402483

Similarly, using pd.to_numeric (assuming your data has this same structure throughout).

df.b = pd.to_numeric(df.b.str[1:], errors='coerce')
print(df)
    a     b
0  51   3.0
1   2   4.0
2  32   NaN
3  99   0.0
4  81  23.0

print (df.dropna(subset=['b']))
    a     b
0  51   3.0
1   2   4.0
3  99   0.0
4  81  23.0

If you want to filter out both NaNs and zeros, use:

print (df[df.b.notnull() & df.b.ne(0)])
    a     b
0  51   3.0
1   2   4.0
4  81  23.0

Upvotes: 4

BENY

Reputation: 323226

df.b=pd.to_numeric(df['b'].str.replace('$', ''),errors='coerce')
df
Out[603]: 
    a     b
0  51   3.0
1   2   4.0
2  32   NaN
3  99   0.0
4  81  23.0

df.loc[(df.b.notnull())&(df.b!=0),:]
Out[604]: 
    a     b
0  51   3.0
1   2   4.0
4  81  23.0

Upvotes: 4

how to filter rows based on

Answers (3)

Related Questions