Reputation: 1007
I am selecting the rows of a pandas DataFrame where a column has a non-null value, using the following code. I need to convert this code to pandas.query().
results = rs_gp[rs_gp['Col1'].notnull()]
When I convert to:
results = rs_gp.query('Col1!=None')
It gives me the error
None is not defined
Upvotes: 28
Views: 23446
Reputation: 371
I don't know whether this was added to pandas after the first answer to this question was written, but notnull() and isnull() are now valid inside pandas queries.
df.query('Col1.isnull()', engine='python')
This returns all rows where Col1 is null.
df.query('Col1.notnull()', engine='python')
Conversely, this query returns every row where Col1 is not null (neither NaN nor None).
In addition: passing engine='python' lets you call pandas methods such as these inside a query.
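As a rough check of the approach above, here is a minimal sketch that compares the query() form against the boolean-mask filter from the question. The sample data is made up for illustration; only the names rs_gp and Col1 come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; rs_gp and Col1 are the names used in the question.
rs_gp = pd.DataFrame({"Col1": ["aaa", np.nan, "bbb", None, "", "ccc"]})

# Boolean-mask filter from the question.
expected = rs_gp[rs_gp["Col1"].notnull()]

# The same filter expressed with query(), passing engine="python" as in this answer.
results = rs_gp.query("Col1.notnull()", engine="python")

# isnull() selects the complementary rows (NaN and None).
missing = rs_gp.query("Col1.isnull()", engine="python")

print(results.index.tolist())  # [0, 2, 4, 5]
print(missing.index.tolist())  # [1, 3]
```

Note that the empty string in row 4 is kept, since it is a value rather than a missing entry.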
Upvotes: 20
Reputation: 210862
We can use the fact that NaN != NaN:
In [1]: np.nan == np.nan
Out[1]: False
So comparing a column to itself returns only the rows whose value is not NaN:
rs_gp.query('Col1 == Col1')
Demo:
In [42]: df = pd.DataFrame({'Col1':['aaa', np.nan, 'bbb', None, '', 'ccc']})
In [43]: df
Out[43]:
   Col1
0   aaa
1   NaN
2   bbb
3  None
4
5   ccc
In [44]: df.query('Col1 == Col1')
Out[44]:
  Col1
0  aaa
2  bbb
4
5  ccc
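The same trick can be inverted to select the missing rows. The sketch below is an illustration rather than part of the original demo: it assumes an object-dtype column (forced here with dtype=object) and checks that 'Col1 == Col1' agrees with notnull(), while 'Col1 != Col1' picks out the NaN and None rows.

```python
import numpy as np
import pandas as pd

# Same values as the demo above; dtype=object is an assumption made here so that
# the column keeps NaN and None as-is regardless of pandas version.
df = pd.DataFrame({"Col1": ["aaa", np.nan, "bbb", None, "", "ccc"]}, dtype=object)

# A missing value never compares equal to itself, so this keeps non-null rows.
non_null = df.query("Col1 == Col1")

# Inverting the comparison keeps only the missing rows.
null_rows = df.query("Col1 != Col1")

# Reference result using the explicit notnull() mask.
mask_result = df[df["Col1"].notnull()]

print(non_null.index.tolist())   # [0, 2, 4, 5]
print(null_rows.index.tolist())  # [1, 3]
```

The self-comparison form needs no engine argument, but it is less readable than notnull(), so a short comment next to it is probably worthwhile.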
Upvotes: 39