Reputation: 145
Loving the Polars library for its fantastic speed and easy syntax!
I'm struggling with this question: is there an analogue in Polars for the Pandas code below? I would like to replace strings using a dictionary.
I tried this expression, but it raises "TypeError: 'dict' object is not callable":
pl.col("List").str.replace_all(lambda key: key,dict())
I'm trying to replace the working Pandas code below with a Polars expression:
import pandas as pd

df = pd.DataFrame({'List': [
    'Systems',
    'Software',
    'Cleared'
]})
dic = {
    'Systems': 'Sys',
    'Software': 'Soft',
    'Cleared': 'Clr'
}
# regex=True treats each key as a pattern and replaces matching substrings
df["List"] = df["List"].replace(dic, regex=True)
Output:
   List
0   Sys
1  Soft
2   Clr
Upvotes: 4
Views: 1396
Reputation: 21544
There is a "stale" feature request for accepting a dictionary:
One possible workaround is to stack multiple expressions in a loop:
expr = pl.col("List")
for old, new in dic.items():
    expr = expr.str.replace_all(old, new)
df.with_columns(result=expr)
shape: (3, 2)
┌──────────┬────────┐
│ List     ┆ result │
│ ---      ┆ ---    │
│ str      ┆ str    │
╞══════════╪════════╡
│ Systems  ┆ Sys    │
│ Software ┆ Soft   │
│ Cleared  ┆ Clr    │
└──────────┴────────┘
For non-regex cases, there is also .str.replace_many():
df.with_columns(
    pl.col("List").str.replace_many(
        ["Systems", "Software", "Cleared"],
        ["Sys", "Soft", "Clr"]
    )
    .alias("result")
)
Upvotes: 4
Reputation: 18671
I think your best bet would be to turn your dic into a dataframe and join the two.
First convert your dic into a format that makes a nice DataFrame. A list of dicts works:
dicdf=pl.DataFrame([{'List':x, 'newList':y} for x,y in dic.items()])
where List is your column name, and we're arbitrarily making newList the new column name, which we'll get rid of later.
You'll want to join that with your original df and then select all columns except the old List, plus newList renamed to List:
df = df.join(dicdf, on='List').select([
    pl.exclude(['List', 'newList']),
    pl.col('newList').alias('List')
])
Upvotes: 1