Reputation: 3046
I have a list of DataFrames in which some DataFrames have NaN values. So far I can identify NaN values for a single DataFrame using this link.
How can I find the indexes in the list of the DataFrames that have NaN values?
Sample list, dffs:
[
var1 var1
14.171250 13.593813
13.578317 13.595329
10.301850 13.580139
9.930217 NaN
6.192517 13.561943
NaN 13.565149
6.197983 13.572509,
var1 var2
2.456183 5.907528
5.052017 5.955731
5.960000 5.972480
8.039317 5.984608
7.559217 5.985348
6.933633 5.979438,
var1 var1
14.171250 23.593813
23.578317 23.595329
56.301850 23.580139
90.930217 22.365676
89.192517 33.561943
86.23654 53.565149
NaN 13.572509,
...]
I need the result to be a list of the indexes, 0 and 2, of the DataFrames that have NaN values.
So far I have tried this:
df_with_nan = []
for df in dffs:
    df_with_nan.append(df.columns[df.isnull().any()])
With the for loop above I get the column names, var1 and var2. However, I need the indexes of those DataFrames as I loop through the list. Any help or suggestion would be great.
Upvotes: 4
Views: 1751
Reputation: 402814
You're almost there. Just use enumerate to loop with indices, and df.isnull().values.any() (faster than df.isnull().any().max()) to test:
df_with_nan = []
for i, df in enumerate(dffs):
    if df.isnull().values.any():
        df_with_nan.append(i)
Granted, a list comp is shorter, but go for whatever you prefer.
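For reference, the loop above could be written as a list comprehension along these lines. This is only a minimal sketch: the three small frames in dffs are made-up stand-ins shaped like the sample in the question, not the asker's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, loosely modelled on the question's list
dffs = [
    pd.DataFrame({"var1": [14.17, np.nan], "var2": [13.59, 13.60]}),
    pd.DataFrame({"var1": [2.46, 5.05], "var2": [5.91, 5.96]}),
    pd.DataFrame({"var1": [14.17, np.nan], "var2": [23.59, 13.57]}),
]

# Keep the position i of every DataFrame that contains at least one NaN
df_with_nan = [i for i, df in enumerate(dffs) if df.isnull().values.any()]
print(df_with_nan)  # [0, 2]
```

Both forms give the same list, so the choice is purely a matter of readability.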
Upvotes: 1
Reputation: 109626
You can use a conditional list comprehension to enumerate over all dataframes in your list and return the enumerated index value of those that contain any null values.
df_with_nan = [n for n, df in enumerate(dffs) if sum(df.isnull().any())]
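To see why the sum works as a condition: df.isnull().any() gives one boolean per column, so summing it counts the columns that contain a NaN, and any non-zero count is truthy. A small sketch of that, using made-up frames and a hypothetical helper list nan_column_counts that is not part of the original answer:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, loosely modelled on the question's list
dffs = [
    pd.DataFrame({"var1": [9.93, np.nan], "var2": [np.nan, 13.57]}),
    pd.DataFrame({"var1": [2.46, 5.05], "var2": [5.91, 5.96]}),
    pd.DataFrame({"var1": [86.24, np.nan], "var2": [53.57, 13.57]}),
]

# Number of columns containing at least one NaN, per DataFrame
nan_column_counts = [int(df.isnull().any().sum()) for df in dffs]
print(nan_column_counts)  # [2, 0, 1]

# A count of 0 is falsy, so only DataFrames with NaN keep their index
df_with_nan = [n for n, count in enumerate(nan_column_counts) if count]
print(df_with_nan)  # [0, 2]
```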
Upvotes: 2