Reputation: 79
I'm trying to calculate the distance between two coordinates stored in a Polars DataFrame.
import polars as pl
pl.Config.set_fmt_str_lengths(2000)
data={"a": ["782.83 7363.51 6293 40 PD","850.68 7513.1 6262.17 40 PD"], "b": ["795.88 7462.65 6293 40 PD","1061.64 7486.08 6124.85 40 PD"]}
df=pl.DataFrame(data)
df.with_columns((pl.col("a").str.replace_all(r" +"," ").str.split(' ',2)).alias('c'))
df
try:
    dfNew=df.with_columns((pl.col("a").str.replace_all(r" +"," ").str.split(' ',2)[:2]).alias('c'))
except Exception as e:
    print('It\'s not working - ', e)
To calculate the distance, I need only the first 3 values from the list created by splitting each value on spaces. When I try to slice the expression, I get the error message "'Expr' object is not subscriptable". How can I overcome it? I then need to apply the same transformation to column b and calculate the distance using numpy.
I tried using a list comprehension and a lambda, but neither worked. Thanks in advance.
Artur
Upvotes: 2
Views: 1988
Reputation: 1462
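To take the first three elements from each list, use the .arr.slice() method (in newer Polars releases the list namespace was renamed from .arr to .list, so the call becomes .list.slice()), then cast the list[str] to a list of floats, list[f64], so it can be used in further calculations.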
df.with_columns([
    pl.col("a").str.replace_all(r" +", " ")
    .str.split(" ").arr.slice(0, 3)
    .cast(pl.List(pl.Float64)).alias("c")
])
Upvotes: 2