MetaStack

Reputation: 3696

Empty DataFrame doesn't admit it's empty

I must be misunderstanding something about emptiness in pandas DataFrames. I have a DataFrame with empty rows, but when I isolate one of these rows, it's not empty.

Here I've made a dataframe:

>>> df = pandas.DataFrame(columns=[1,2,3], data=[[1,2,3],[1,None,3],[None, None, None],[3,2,1],[4,5,6],[None,None,None],[None,None,None]])
>>> df
     1    2    3
0  1.0  2.0  3.0
1  1.0  NaN  3.0
2  NaN  NaN  NaN
3  3.0  2.0  1.0
4  4.0  5.0  6.0
5  NaN  NaN  NaN
6  NaN  NaN  NaN

I know row 2 contains nothing but NaN, so I check for that:

>>> df[2:3].empty
False

Odd. So I split it out into its own dataframe:

>>> df1 = df[2:3]
>>> df1
    1   2   3
2 NaN NaN NaN

>>> df1.empty
False

How do I check for emptiness (all the elements in a row being None or NaN)?

http://pandas.pydata.org/pandas-docs/version/0.18/generated/pandas.DataFrame.empty.html

Upvotes: 0

Views: 923

Answers (6)

Saghe Achraf

Reputation: 330

I guess you have to use isnull() instead of empty (which is a property, not a method).

import pandas
df = pandas.DataFrame(columns=[1,2,3], data=[[1,2,3],[1,None,3],[None, None, None],[3,2,1],[4,5,6],[None,None,None],[None,None,None]])
df[2:3].isnull()
      1     2     3
2  True  True  True
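
As a minimal sketch of how that element-wise mask could be turned into a single answer per row, the example below reduces it with all(axis=1). The variable names mask and row_is_all_null are only illustrative, not part of the question.

```python
import pandas as pd

df = pd.DataFrame(
    columns=[1, 2, 3],
    data=[[1, 2, 3], [1, None, 3], [None, None, None], [3, 2, 1],
          [4, 5, 6], [None, None, None], [None, None, None]],
)

# isnull() returns a boolean frame with the same shape as the slice.
mask = df[2:3].isnull()

# all(axis=1) collapses each row to one boolean:
# True only if every column in that row is null.
row_is_all_null = mask.all(axis=1)
print(row_is_all_null)
```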

Upvotes: 1

Harikrishna

Reputation: 1140

If you want to drop every row in which all of the columns are NaN, you can do this:

df.dropna(how='all')

I noticed that some rows of your dataframe have NaN in only some of the columns. If you need to drop those rows as well:

df.dropna(how='any')

After you do this (whichever you prefer), you can check the length of the dataframe (the number of rows it contains). Note that dropna returns a new dataframe, so assign the result back to df first, then use:

len(df)
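
To make the difference between the two options concrete, here is a small sketch using the dataframe from the question. The names all_dropped and any_dropped are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    columns=[1, 2, 3],
    data=[[1, 2, 3], [1, None, 3], [None, None, None], [3, 2, 1],
          [4, 5, 6], [None, None, None], [None, None, None]],
)

# how='all' removes only rows where every column is NaN (rows 2, 5, 6).
all_dropped = df.dropna(how='all')

# how='any' also removes row 1, which has a single NaN.
any_dropped = df.dropna(how='any')

print(len(df), len(all_dropped), len(any_dropped))  # 7 4 3
```

The original df is left untouched, which is why the results are stored under new names here.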

Upvotes: 1

BENY

Reputation: 323226

If you do not want to count NaN values as real numbers, your check is equivalent to

df.dropna().iloc[5:]

which selects rows that no longer exist in your dataframe:

df.dropna().iloc[5:].empty
Out[921]: True

Upvotes: 1

Alexander

Reputation: 109528

You can drop the rows that are entirely null from your selection and check if the result is empty:

>>> df[5:].dropna(how='all').empty
True

Upvotes: 2

MaxU - stand with Ukraine

Reputation: 210832

I guess you are looking for something like this:

In [296]: df[5:]
Out[296]:
    1   2   3
5 NaN NaN NaN
6 NaN NaN NaN

In [297]: df[5:].isnull().all(1).all()
Out[297]: True

or even better (as proposed by @IanS):

In [300]: df[5:].isnull().all().all()
Out[300]: True

Upvotes: 2

cs95

Reputation: 402283

You're misunderstanding what empty is for. It checks whether the size of a series/dataframe is 0, meaning there are no rows (or no columns). It says nothing about whether the values are NaN. For example,

df.iloc[1:0]

Empty DataFrame
Columns: [1, 2, 3]
Index: []

df.iloc[1:0].empty
True
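
A short side-by-side sketch of that distinction, using the question's dataframe (no_rows and nan_row are illustrative names only):

```python
import pandas as pd

df = pd.DataFrame(
    columns=[1, 2, 3],
    data=[[1, 2, 3], [1, None, 3], [None, None, None], [3, 2, 1],
          [4, 5, 6], [None, None, None], [None, None, None]],
)

no_rows = df.iloc[1:0]   # zero rows, three columns
nan_row = df[2:3]        # one row, every value NaN

print(no_rows.empty)     # True: the row axis has length 0
print(nan_row.empty)     # False: the frame still holds one row
print(nan_row.shape)     # (1, 3)
```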

If you want to check that a row has all NaNs, use isnull + all:

df.isnull().all(1)

0    False
1    False
2     True
3    False
4    False
5     True
6     True
dtype: bool

For your example, this should do:

df[2:3].isnull().all(1).item()
True

Note that you can't use item if your slice is more than one row in size.
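
A rough sketch of what happens with a longer slice, and one possible workaround: item() raises ValueError when the Series holds more than one element, so a second all() can be used to collapse the per-row results instead. The names rows, item_failed and all_empty are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    columns=[1, 2, 3],
    data=[[1, 2, 3], [1, None, 3], [None, None, None], [3, 2, 1],
          [4, 5, 6], [None, None, None], [None, None, None]],
)

# Two rows (5 and 6), so the per-row result has two elements.
rows = df[5:].isnull().all(axis=1)

try:
    rows.item()          # only valid for a single-element Series
    item_failed = False
except ValueError:
    item_failed = True

# Collapse the per-row booleans into one answer for the whole slice.
all_empty = bool(rows.all())
print(item_failed, all_empty)
```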

Upvotes: 2
