Reputation: 2217
I am trying to check whether a DataFrame contains null or NaN values in the following way:
import numpy as np
import pandas as pd
# load the time series of known points
y = [np.array([])]
y = [np.array([10, 11, 11, 11, 2, 4, 3, 7, 8, 9]),
np.array([1, 1, 1, 2, 2, 2, 2, 3, 2, 1]),
np.array([1, 1, 2, 2, 2, 2, 3, 2, 1])]
y = pd.DataFrame(y)
for i in range(len(y)):
    if (pd.isnull(y.any) == True):
        print("Error: t2 array index " + str(i) + " has NaN or null!")
print("All good bro")
print(pd.isnull(y))
print(pd.isnull(y.any))
When I print y, you can clearly see that the last element is NaN. This is because the third numpy array is shorter than the others: pandas automatically fills the missing value with NaN to keep the DataFrame shape.
However, if I print pd.isnull(y.any), I get False.
I tried y.all instead and also got False.
I also tried y[1].any, with the same outcome.
What am I missing here?
Upvotes: 1
Views: 1200
Reputation: 5334
You can use:
y.isnull().values.any()
Code:
import numpy as np
import pandas as pd
# load the time series of known points
y = [np.array([])]
y = [np.array([10, 11, 11, 11, 2, 4, 3, 7, 8, 9]),
np.array([1, 1, 1, 2, 2, 2, 2, 3, 2, 1]),
np.array([1, 1, 2, 2, 2, 2, 3, 2, 1])]
y = pd.DataFrame(y)
print(y.isnull().values.any())
Output:
True
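As for why the original check printed False: y.any without parentheses is the method object itself, not the result of calling it, so pd.isnull() is handed an ordinary Python object that is not null. Here is a rough sketch of that, plus one possible way to report which rows contain NaN, closer to what the loop in the question was after. The names method_check, has_nan, row_has_nan and rows_with_nan are just illustrative, and y.isna().any().any() is an alternative spelling of the whole-frame check, not the only option.

```python
import numpy as np
import pandas as pd

# Same ragged input as in the question; the shorter third row is padded with NaN.
y = pd.DataFrame([
    np.array([10, 11, 11, 11, 2, 4, 3, 7, 8, 9]),
    np.array([1, 1, 1, 2, 2, 2, 2, 3, 2, 1]),
    np.array([1, 1, 2, 2, 2, 2, 3, 2, 1]),
])

# y.any (no parentheses) is a bound method, not a boolean result.
# pd.isnull() sees a plain non-null object and reports False.
method_check = pd.isnull(y.any)

# Whole-frame check: is there at least one missing value anywhere?
has_nan = y.isna().any().any()

# Per-row check: boolean Series, True for every row that holds a NaN.
row_has_nan = y.isna().any(axis=1)
rows_with_nan = row_has_nan[row_has_nan].index.tolist()

for i in rows_with_nan:
    print(f"Error: row {i} has NaN or null!")
if not rows_with_nan:
    print("All good")
```

The per-row mask avoids looping over the frame by hand; use axis=0 instead if you want to know which columns are affected.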
Upvotes: 2