Reputation: 3
I am writing a Rust program using Polars, and I would like to know how to manipulate a string column in a dataframe.
For example, I have the following dataframe:
| Id | Text |
|---|---|
| 1 | Some foo text |
| 2 | Other text |
I would like to replace every occurrence of "foo" in the Text column with another value, such as "new foo", but I wasn't able to find a way to do that. Can anyone help me with this?
I tried using the with_column function, but I couldn't work out how to apply it here.
Upvotes: 0
Views: 559
Reputation: 24602
You can use the replace_literal_all method. Note that it is gated behind the strings feature flag.
[dependencies]
polars = { version = "0.38.1", features = ["strings"] }
Then in your Rust code, you can do the transformation like this:
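use polars::prelude::*;

fn main() {
    // Build the example dataframe.
    let mut df = df! [
        "Id" => [1, 2],
        "Text" => ["Some foo text", "Other text"],
    ]
    .unwrap();
    println!("Before = {:?}", df);

    // View the column as strings and replace every literal "foo".
    let new_column = df["Text"]
        .str()
        .unwrap()
        .replace_literal_all("foo", "new_foo")
        .unwrap();

    // with_column replaces the existing "Text" column because the name matches.
    let modified = df.with_column(new_column).unwrap();
    println!("After = {:?}", modified);
}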
This will produce the following output:
Before = shape: (2, 2)
┌─────┬───────────────┐
│ Id ┆ Text │
│ --- ┆ --- │
│ i32 ┆ str │
╞═════╪═══════════════╡
│ 1 ┆ Some foo text │
│ 2 ┆ Other text │
└─────┴───────────────┘
After = shape: (2, 2)
┌─────┬───────────────────┐
│ Id ┆ Text │
│ --- ┆ --- │
│ i32 ┆ str │
╞═════╪═══════════════════╡
│ 1 ┆ Some new_foo text │
│ 2 ┆ Other text │
└─────┴───────────────────┘
If you want regex-based replacement, you can use the replace_all API instead. Note that it requires the regex feature flag to be enabled.
Upvotes: 1
Reputation: 369
You can do it like this:
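use polars::prelude::*;
use polars::df;
use regex::Regex;

fn main() {
    // Build the example dataframe with the df! macro.
    let mut df = df! [
        "Id" => [1, 2],
        "Text" => ["Some foo text", "Other text"],
    ].unwrap();
    println!("before = {:?}", df);

    let re = Regex::new(r"foo").unwrap();
    let target = "new foo";

    // Older Polars releases expose the string accessor as `.utf8()`;
    // newer ones (e.g. 0.38) call it `.str()`.
    // Regex::replace only substitutes the first match in each value;
    // use Regex::replace_all to substitute every match.
    let new_foo_series = df.column("Text").unwrap()
        .utf8()
        .unwrap()
        .apply(|e| re.replace(e, target));

    // with_column replaces the existing "Text" column because the name matches.
    let df2 = df.with_column(new_foo_series).unwrap();
    println!("after = {:?}", df2);
}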
That will show:
before = shape: (2, 2)
┌─────┬───────────────┐
│ Id ┆ Text │
│ --- ┆ --- │
│ i32 ┆ str │
╞═════╪═══════════════╡
│ 1 ┆ Some foo text │
│ 2 ┆ Other text │
└─────┴───────────────┘
after = shape: (2, 2)
┌─────┬───────────────────┐
│ Id ┆ Text │
│ --- ┆ --- │
│ i32 ┆ str │
╞═════╪═══════════════════╡
│ 1 ┆ Some new foo text │
│ 2 ┆ Other text │
└─────┴───────────────────┘
Reference: https://docs.rs/polars/latest/polars/docs/eager/index.html#apply-functions-closures
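One thing to watch with the closure approach: Regex::replace substitutes only the first match in each string, while Regex::replace_all substitutes every match. The sketch below illustrates that difference without any dependencies. It is only an analogy, not Polars code: a plain slice of Option<&str> stands in for the nullable Text column, str::replacen and str::replace stand in for the two regex methods, and the names replace_first, replace_every and map_column are made up for this illustration.

```rust
use std::borrow::Cow;

/// Replace only the first occurrence of `pat`, borrowing the input when nothing matches.
fn replace_first<'a>(value: &'a str, pat: &str, with: &str) -> Cow<'a, str> {
    if value.contains(pat) {
        Cow::Owned(value.replacen(pat, with, 1))
    } else {
        Cow::Borrowed(value)
    }
}

/// Replace every occurrence of `pat`, borrowing the input when nothing matches.
fn replace_every<'a>(value: &'a str, pat: &str, with: &str) -> Cow<'a, str> {
    if value.contains(pat) {
        Cow::Owned(value.replace(pat, with))
    } else {
        Cow::Borrowed(value)
    }
}

/// Apply a per-value function to a stand-in for a nullable string column.
/// Missing values (None) are passed through unchanged.
fn map_column<'a, F>(column: &[Option<&'a str>], f: F) -> Vec<Option<String>>
where
    F: Fn(&'a str) -> Cow<'a, str>,
{
    column
        .iter()
        .map(|&v| v.map(|s| f(s).into_owned()))
        .collect()
}
```

Calling map_column(&column, |s| replace_first(s, "foo", "new foo")) on a value such as "foo and foo" changes only the first "foo", whereas the replace_every variant changes both. If a value in your data can contain the pattern more than once, the "replace all" form is probably the one you want.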
Upvotes: 0