Rtut

Reputation: 1007

pandas.DataFrame.query() - fetch not null rows (pandas equivalent to SQL: "IS NOT NULL")

I am selecting the rows that have a non-null value in a column of a pandas DataFrame with the following code. I need to express the same filter with DataFrame.query().

results = rs_gp[rs_gp['Col1'].notnull()]

When I convert to:

results = rs_gp.query('Col1!=None')

It gives me the error

None is not defined

Upvotes: 28

Views: 23446

Answers (2)

Phil

Reputation: 371

I don't know whether this was added to pandas after the first answer to this question, but notnull() and isnull() are now valid in query expressions.

df.query('Col1.isnull()', engine='python')

This returns all rows where Col1 is null.

df.query('Col1.notnull()', engine='python')

Conversely, this query returns every row where Col1 is not null.

In addition: setting engine='python' lets you call pandas methods such as these inside a query.
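
As a minimal, self-contained sketch of the two queries above, the results can be checked against the boolean-mask filter from the question. The sample data is made up for illustration; only the column name Col1 comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame; only the column name Col1 is from the question.
df = pd.DataFrame({"Col1": ["aaa", np.nan, "bbb", None, "", "ccc"]})

# Method calls inside the query string, evaluated with the python engine.
not_null = df.query("Col1.notnull()", engine="python")
is_null = df.query("Col1.isnull()", engine="python")

# Reference result: the boolean-mask version from the question.
expected = df[df["Col1"].notnull()]

print(not_null.equals(expected))
```

Note that the empty string counts as a value here, so that row is kept by notnull(), just as it is by the boolean mask.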

Upvotes: 20

MaxU - stand with Ukraine

Reputation: 210862

We can use the fact that NaN != NaN:

In [1]: np.nan == np.nan
Out[1]: False

So comparing a column to itself returns only the non-NaN rows:

rs_gp.query('Col1 == Col1')

Demo:

In [42]: df = pd.DataFrame({'Col1':['aaa', np.nan, 'bbb', None, '', 'ccc']})

In [43]: df
Out[43]:
   Col1
0   aaa
1   NaN
2   bbb
3  None
4
5   ccc

In [44]: df.query('Col1 == Col1')
Out[44]:
  Col1
0  aaa
2  bbb
4
5  ccc
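
A quick sanity check, as a sketch only: assuming a hypothetical float column (so the result does not depend on how None is stored), the self-comparison should select the same rows as the notnull() mask from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data; Col1 is the column name used in the question.
rs_gp = pd.DataFrame({"Col1": [1.0, np.nan, 3.0, np.nan, 5.0]})

# NaN != NaN, so the comparison is False exactly on the NaN rows.
via_query = rs_gp.query("Col1 == Col1")

# The original boolean-mask filter, for comparison.
via_mask = rs_gp[rs_gp["Col1"].notnull()]

print(via_query.equals(via_mask))
```

The self-comparison needs no engine argument, which makes it a compact alternative when method calls inside query() are not an option.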

Upvotes: 39
