NChechulin

Reputation: 17

Polars apply map_elements to all values (including nulls)

I have a column in a dataset that contains nulls (the values to be predicted) alongside regular values. I want to create an is_null column that says, element-wise, whether each value in that column is null.

I came across the .map_elements method, but it skips the null values. Here's an example:

import polars as pl


df = pl.DataFrame({"foo": [1, None, 3], "bar": [-1, None, 8]})

# shape: (3, 2)
# ┌──────┬──────┐
# │ foo  ┆ bar  │
# │ ---  ┆ ---  │
# │ i64  ┆ i64  │
# ╞══════╪══════╡
# │ 1    ┆ -1   │
# │ null ┆ null │
# │ 3    ┆ 8    │
# └──────┴──────┘

def print_and_fill(value):
    print("Value is", value)
    return 1

df["foo"].map_elements(print_and_fill)

## Output ##
# Value is 1
# Value is 3

# shape: (3,)
# Series: 'foo' [i64]
# [
#   1
#   null
#   1
# ]

Clearly, the null value was skipped. Is there any way to apply the function to all values?

I found a workaround: temporarily call .fill_null() and then .map_elements(), but this is clearly not the best solution.

Upvotes: 2

Views: 1531

Answers (1)

user18559875

map_elements has a skip_nulls= parameter which defaults to True.

In general, it's best to avoid map_elements unless absolutely necessary: it calls a Python function once per element, which is much slower than Polars' native expressions.

You wrote: "I wanted to create an is_null column which says whether the first column's values were null or not (element-wise)."

One easy way is to use the is_null expression. For example:

(
    df
    .with_columns(
        pl.col('foo').is_null().alias('foo_is_null')
    )
)

shape: (3, 3)
┌──────┬──────┬─────────────┐
│ foo  ┆ bar  ┆ foo_is_null │
│ ---  ┆ ---  ┆ ---         │
│ i64  ┆ i64  ┆ bool        │
╞══════╪══════╪═════════════╡
│ 1    ┆ -1   ┆ false       │
│ null ┆ null ┆ true        │
│ 3    ┆ 8    ┆ false       │
└──────┴──────┴─────────────┘

Upvotes: 2
