Nyssance

Reputation: 367

How to convert float Columns without decimal to Int in Polars?

My pandas code looks like this. If a float column contains only whole numbers such as 1.0, 2.0, 3.0, I want to remove the .0 and get integers:

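import pandas as pd

df = pd.DataFrame({
    "date": ["2025-01-01", "2025-01-02"],
    "a": [1.0, 2.0],
    "c": [1.0, 2.1],
})
print(df)
# Every column except "date"
columns = df.columns.difference(["date"])
# Element-wise: turn whole-number floats into ints (DataFrame.map needs pandas >= 2.1)
df[columns] = df[columns].map(lambda x: int(x) if x.is_integer() else x)
print(df)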
         date    a    c
0  2025-01-01  1.0  1.0
1  2025-01-02  2.0  2.1
         date  a    c
0  2025-01-01  1  1.0
1  2025-01-02  2  2.1

Upvotes: 0

Views: 76

Answers (1)

Henry Harbeck

Reputation: 1398

Something like this does the trick.

Note that it is generally not advisable to have the schema depend on the data itself. We can, however, avoid any row-by-row iteration and use a vectorised UDF with map_batches:

import polars as pl

def maybe_cast_int(s: pl.Series) -> pl.Series:
    """Cast the Series to an Int64 type if all values are whole numbers."""
    # Casting truncates any fractional part, so s2 equals s only for whole numbers
    s2 = s.cast(pl.Int64)
    return s2 if (s2 == s).all() else s

df = pl.DataFrame({
    "date": ["2025-01-01", "2025-01-02"],
    "a": [1.0, 2.0],
    "c": [1.0, 2.1],
})

df.with_columns(pl.col("a", "c").map_batches(maybe_cast_int))
shape: (2, 3)
┌────────────┬─────┬─────┐
│ date       ┆ a   ┆ c   │
│ ---        ┆ --- ┆ --- │
│ str        ┆ i64 ┆ f64 │
╞════════════╪═════╪═════╡
│ 2025-01-01 ┆ 1   ┆ 1.0 │
│ 2025-01-02 ┆ 2   ┆ 2.1 │
└────────────┴─────┴─────┘

This example shows the effect a bit more clearly by not overwriting the original columns:

df.select(
    "a",
    pl.col("a").map_batches(maybe_cast_int).alias("b"),
    "c",
    pl.col("c").map_batches(maybe_cast_int).alias("d"),
)
shape: (2, 4)
┌─────┬─────┬─────┬─────┐
│ a   ┆ b   ┆ c   ┆ d   │
│ --- ┆ --- ┆ --- ┆ --- │
│ f64 ┆ i64 ┆ f64 ┆ f64 │
╞═════╪═════╪═════╪═════╡
│ 1.0 ┆ 1   ┆ 1.0 ┆ 1.0 │
│ 2.0 ┆ 2   ┆ 2.1 ┆ 2.1 │
└─────┴─────┴─────┴─────┘

Upvotes: 2
