Reputation: 1335
I know I can do .null_count() in Polars, which returns a dataframe telling me the null count for each column.
import polars as pl

d = {"foo": [1, 2, 3, None], "bar": [4, None, None, 6]}
df_polars_withnull = pl.from_dict(d)
df_polars_withnull.null_count()
would yield a dataframe:
foo  bar
1    2
I want to know if there are any nulls anywhere in the entire dataframe, something like:
if any(df_polars_withnull.null_count()):
    print('has nulls')
else:
    print('no nulls')
Unfortunately, that doesn't work. What is the correct code here?
This works, but seems a bit ugly:
if df_polars_withnull.null_count().sum(axis=1)[0]:
    print('has nulls')
else:
    print('no nulls')
Upvotes: 5
Views: 5298
Reputation: 262224
What about reshaping to a single column with unpivot, then running is_null + any:
df_polars_withnull.unpivot()['value'].is_null().any()
Or:
any(df_polars_withnull.select(pl.all().is_null().any()).row(0))
Output: True
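For reference, a self-contained sketch of this approach on the question's data (assuming a recent Polars release, where unpivot replaced the older melt):
import polars as pl

# The question's data: one null in "foo", two in "bar".
df_polars_withnull = pl.from_dict({"foo": [1, 2, 3, None], "bar": [4, None, None, 6]})

# unpivot() stacks every column into long "variable"/"value" columns,
# so a single is_null().any() on "value" covers the whole dataframe.
has_nulls = df_polars_withnull.unpivot()["value"].is_null().any()
print(has_nulls)  # True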
Upvotes: 1
Reputation: 10464
I would consider the following approaches idiomatic polars. Note that in both cases, I use pl.DataFrame.item to ensure the output is a bool instead of a pl.DataFrame of shape (1, 1).
df.null_count().pipe(sum).item() > 0
df.select(pl.any_horizontal(pl.all().is_null().any())).item()
Explanation. If df is of shape (n, c), then:
- pl.all().is_null() will give a boolean dataframe of shape (n, c) indicating for each element whether it is null;
- pl.all().is_null().any() will give a boolean dataframe of shape (1, c) indicating for each column whether it contains at least one null;
- pl.any_horizontal(pl.all().is_null().any()) will give a boolean dataframe of shape (1, 1) indicating whether there were any null values in df.
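For reference, a quick sketch running both expressions on the question's data (the data and variable name here mirror the question, not this answer's original text):
import polars as pl

df = pl.from_dict({"foo": [1, 2, 3, None], "bar": [4, None, None, 6]})

# Approach 1: sum the per-column null counts with Python's built-in sum
# (applied via pipe), then compare the scalar total to 0.
print(df.null_count().pipe(sum).item() > 0)  # True

# Approach 2: compute a per-column "has a null" flag, reduce across columns
# with any_horizontal, and extract the single boolean with item().
print(df.select(pl.any_horizontal(pl.all().is_null().any())).item())  # True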
Upvotes: 6
Reputation: 4827
Maybe this is what you are looking for:
df_polars_withnull.null_count().sum_horizontal()
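To turn that into the question's boolean check, one option (a sketch building on this answer's expression) is to take the single element of the Series it returns:
import polars as pl

df_polars_withnull = pl.from_dict({"foo": [1, 2, 3, None], "bar": [4, None, None, 6]})

# null_count() is a (1, n_cols) dataframe; sum_horizontal() collapses it
# into a one-element Series holding the total number of nulls.
total_nulls = df_polars_withnull.null_count().sum_horizontal()[0]
if total_nulls:
    print("has nulls")
else:
    print("no nulls")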
Upvotes: 2