Reputation: 1634
I am wondering if there is a way to handle conditional assignment in a Polars DataFrame without relying on NumPy.
import polars as pl
import numpy as np
df = pl.DataFrame({
    'team': ['A', 'A', 'A', 'B', 'B', 'C'],
    'conference': ['East', 'East', 'East', 'West', 'West', 'East'],
    'points': [11, 8, 10, 6, 6, 5],
    'rebounds': [7, 7, 6, 9, 12, 8],
})
shape: (6, 4)
┌──────┬────────────┬────────┬──────────┐
│ team ┆ conference ┆ points ┆ rebounds │
│ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ i64 ┆ i64 │
╞══════╪════════════╪════════╪══════════╡
│ A ┆ East ┆ 11 ┆ 7 │
│ A ┆ East ┆ 8 ┆ 7 │
│ A ┆ East ┆ 10 ┆ 6 │
│ B ┆ West ┆ 6 ┆ 9 │
│ B ┆ West ┆ 6 ┆ 12 │
│ C ┆ East ┆ 5 ┆ 8 │
└──────┴────────────┴────────┴──────────┘
Using numpy, we could do:
conditions = [
df['points'].le(6) & df['rebounds'].le(9),
df['points'].gt(10) & df['rebounds'].gt(6)
]
choicelist = ['Bad','Good']
df.with_columns(rating = np.select(conditions, choicelist, 'Aveg'))
shape: (6, 5)
┌──────┬────────────┬────────┬──────────┬────────┐
│ team ┆ conference ┆ points ┆ rebounds ┆ rating │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ i64 ┆ i64 ┆ str │
╞══════╪════════════╪════════╪══════════╪════════╡
│ A ┆ East ┆ 11 ┆ 7 ┆ Good │
│ A ┆ East ┆ 8 ┆ 7 ┆ Aveg │
│ A ┆ East ┆ 10 ┆ 6 ┆ Aveg │
│ B ┆ West ┆ 6 ┆ 9 ┆ Bad │
│ B ┆ West ┆ 6 ┆ 12 ┆ Aveg │
│ C ┆ East ┆ 5 ┆ 8 ┆ Bad │
└──────┴────────────┴────────┴──────────┴────────┘
Upvotes: 1
Views: 1606
Reputation: 14770
You can chain when -> then -> otherwise expressions:
df.with_columns(
    pl.when((pl.col("points") <= 6) & (pl.col("rebounds") <= 9))
    .then(pl.lit("Bad"))
    .when((pl.col("points") > 10) & (pl.col("rebounds") > 6))
    .then(pl.lit("Good"))
    .otherwise(pl.lit("Aveg"))
    .alias("rating")
)
shape: (6, 5)
┌──────┬────────────┬────────┬──────────┬────────┐
│ team ┆ conference ┆ points ┆ rebounds ┆ rating │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ str ┆ i64 ┆ i64 ┆ str │
╞══════╪════════════╪════════╪══════════╪════════╡
│ A ┆ East ┆ 11 ┆ 7 ┆ Good │
│ A ┆ East ┆ 8 ┆ 7 ┆ Aveg │
│ A ┆ East ┆ 10 ┆ 6 ┆ Aveg │
│ B ┆ West ┆ 6 ┆ 9 ┆ Bad │
│ B ┆ West ┆ 6 ┆ 12 ┆ Aveg │
│ C ┆ East ┆ 5 ┆ 8 ┆ Bad │
└──────┴────────────┴────────┴──────────┴────────┘
when also accepts multiple predicates as positional arguments (*args), which are implicitly combined using &. That may be preferable in this case:
.when(pl.col("points") <= 6, pl.col("rebounds") <= 9)
Upvotes: 3