isnull(df.any) is returning False despite NaN in the DataFrame

Question

I am trying to check whether a DataFrame contains null or NaN values in the following way:

import numpy as np
import pandas as pd

# load the time series of known points
y = [np.array([])]

y = [np.array([10, 11, 11, 11, 2, 4, 3, 7, 8, 9]),  
     np.array([1, 1, 1, 2, 2, 2, 2, 3, 2, 1]), 
     np.array([1, 1, 2, 2, 2, 2, 3, 2, 1])]
y = pd.DataFrame(y)



for i in range(len(y)):
    if (pd.isnull(y.any) == True):
        print ("Error: t2 array index " + str(i) + " has NaN or null!")
print("All good bro")
print (pd.isnull(y))
print (pd.isnull(y.any))

When I print y, you can clearly see that the last element is NaN. This is because the third NumPy array is shorter than the others; pandas automatically fills the missing value with NaN to keep the DataFrame's shape.

However, if I print pd.isnull(y.any), I get False. I tried y.all instead and also got False. I also tried y[1].any, with the same outcome.

What am I missing here?

Nihal · Accepted Answer

You can use:

y.isnull().values.any()
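
As for why the original check returned False: y.any written without parentheses refers to the method object itself rather than to its result, so pd.isnull(y.any) only asks whether that method object is null. The short sketch below illustrates the difference; the small two-row DataFrame is a made-up stand-in, not the data from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data with one missing value.
y = pd.DataFrame([[1.0, 2.0], [3.0, np.nan]])

# Without parentheses, y.any is a bound method, not a computed result.
print(callable(y.any))

# pd.isnull on a non-null Python object is falsy, whatever the
# DataFrame actually contains.
print(pd.isnull(y.any))

# Test the values for NaN first, then reduce the boolean mask.
has_nan = y.isnull().values.any()
print(has_nan)
```

Note that calling the method as y.any() would not help either: it reduces truthiness, and NaN counts as truthy, so the null test has to come first.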

Code:

import numpy as np
import pandas as pd

# load the time series of known points
y = [np.array([])]

y = [np.array([10, 11, 11, 11, 2, 4, 3, 7, 8, 9]),  
     np.array([1, 1, 1, 2, 2, 2, 2, 3, 2, 1]), 
     np.array([1, 1, 2, 2, 2, 2, 3, 2, 1])]
y = pd.DataFrame(y)

print(y.isnull().values.any())

Output:

True
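
The loop in the question seems to be aiming at a per-row report (which array index contains NaN). One possible way to get that, assuming the rows of y correspond to the original arrays as they do above, is to reduce the null mask along axis=1. The names rows_with_nan and bad_rows are illustrative only.

```python
import numpy as np
import pandas as pd

y = pd.DataFrame([
    np.array([10, 11, 11, 11, 2, 4, 3, 7, 8, 9]),
    np.array([1, 1, 1, 2, 2, 2, 2, 3, 2, 1]),
    np.array([1, 1, 2, 2, 2, 2, 3, 2, 1]),  # one element short, padded with NaN
])

# Boolean Series indexed by row: True where the row has at least one NaN.
rows_with_nan = y.isnull().any(axis=1)

# Keep only the flagged rows and collect their index labels.
bad_rows = rows_with_nan[rows_with_nan].index.tolist()

for i in bad_rows:
    print("Error: t2 array index " + str(i) + " has NaN or null!")
if not bad_rows:
    print("All good")
```

Here only row 2 is flagged, and the success message is printed only when no row contains a missing value.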
