Reputation: 427
I have a data set with three columns. Column A is to be checked for strings. If the string matches foo
or spam
, the values in the same row for the other two columns L
and G
should be changed to XX
. For this I have tried the following.
df = pl.DataFrame(
{
"A": ["foo", "ham", "spam", "egg",],
"L": ["A54", "A12", "B84", "C12"],
"G": ["X34", "C84", "G96", "L6",],
}
)
print(df)
shape: (4, 3)
┌──────┬─────┬─────┐
│ A ┆ L ┆ G │
│ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str │
╞══════╪═════╪═════╡
│ foo ┆ A54 ┆ X34 │
│ ham ┆ A12 ┆ C84 │
│ spam ┆ B84 ┆ G96 │
│ egg ┆ C12 ┆ L6 │
└──────┴─────┴─────┘
expected outcome
shape: (4, 3)
┌──────┬─────┬─────┐
│ A ┆ L ┆ G │
│ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str │
╞══════╪═════╪═════╡
│ foo ┆ XX ┆ XX │
│ ham ┆ A12 ┆ C84 │
│ spam ┆ XX ┆ XX │
│ egg ┆ C12 ┆ L6 │
└──────┴─────┴─────┘
I tried this
df = df.with_columns(
pl.when((pl.col("A") == "foo") | (pl.col("A") == "spam"))
.then((pl.col("L")= "XX") & (pl.col( "G")= "XX"))
.otherwise((pl.col("L"))&(pl.col( "G")))
)
However, this does not work. Can someone help me with this?
Upvotes: 9
Views: 8419
Reputation: 21534
For setting multiple columns to the same value you could use:
df.with_columns(
pl.when(pl.col("A").is_in(["foo", "spam"]))
.then(pl.lit("XX"))
.otherwise(pl.col("L", "G"))
.name.keep()
)
shape: (4, 3)
┌──────┬─────┬─────┐
│ A ┆ L ┆ G │
│ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str │
╞══════╪═════╪═════╡
│ foo ┆ XX ┆ XX │
│ ham ┆ A12 ┆ C84 │
│ spam ┆ XX ┆ XX │
│ egg ┆ C12 ┆ L6 │
└──────┴─────┴─────┘
.is_in()
can be used instead of multiple == x | == y
chainsIf you need different values, you can pack them into a struct
and extract/unnest the fields.
df.with_columns(
pl.when(pl.col("A").is_in(["foo", "spam"]))
.then(pl.struct(L = pl.lit("AAA"), G = pl.lit("BBB")))
.otherwise(pl.struct("L", "G"))
.struct.field("L", "G") # or .struct.unnest()
)
shape: (4, 3)
┌──────┬─────┬─────┐
│ A ┆ L ┆ G │
│ --- ┆ --- ┆ --- │
│ str ┆ str ┆ str │
╞══════╪═════╪═════╡
│ foo ┆ AAA ┆ BBB │
│ ham ┆ A12 ┆ C84 │
│ spam ┆ AAA ┆ BBB │
│ egg ┆ C12 ┆ L6 │
└──────┴─────┴─────┘
Upvotes: 10