user3240688

Reputation: 1335

Polars - check for null in dataframe

I know I can call .null_count() in Polars, which returns a DataFrame with the null count of each column.

import polars as pl

d = {"foo": [1, 2, 3, None], "bar": [4, None, None, 6]}
df_polars_withnull = pl.from_dict(d)

df_polars_withnull.null_count()

would yield a dataframe

foo   bar
  1     2

I want to know whether there are any nulls anywhere in the DataFrame, with something like:

if any(df_polars_withnull.null_count()):
    print('has nulls')
else:
    print('no nulls')

Unfortunately, that doesn't work. What is the correct code here?

This works, but seems a bit ugly:

if df_polars_withnull.null_count().sum(axis=1)[0]:
    print('has nulls')
else:
    print('no nulls')

Upvotes: 5

Views: 5298

Answers (3)

mozway

Reputation: 262224

What about reshaping to a single column with unpivot, then chaining is_null and any:

df_polars_withnull.unpivot()['value'].is_null().any()

Or:

any(df_polars_withnull.select(pl.all().is_null().any()).row(0))

Output: True

Upvotes: 1

Hericks

Reputation: 10464

I would consider the following approaches idiomatic polars. Note that in both cases, I use pl.DataFrame.item to ensure the output is a bool instead of a pl.DataFrame of shape (1, 1).

Approach 1. Check whether the sum of the null counts is greater than 0.

df.null_count().pipe(sum).item() > 0

Approach 2. Check directly whether any null values exist.

df.select(pl.any_horizontal(pl.all().is_null().any())).item()

Explanation. If df is of shape (n, c), then:

  • using pl.all().is_null() will give a boolean dataframe of shape (n, c) indicating for each element whether it is null;
  • using pl.all().is_null().any() will give a boolean dataframe of shape (1, c) indicating for each column whether it contains at least one null;
  • using pl.any_horizontal(pl.all().is_null().any()) will give a boolean dataframe of shape (1, 1) indicating whether there were any null values in df.

Upvotes: 6

René

Reputation: 4827

Maybe this is what you are looking for:

df_polars_withnull.null_count().sum_horizontal()

Upvotes: 2
