robertdj

Reputation: 1117

Convert Polars dataframe to vector of structs

I am making a Maturin project involving Polars on both the Python and Rust side.

In Python I have a dataframe with columns a and b:

import polars as pl
df = pl.DataFrame({'a': [1, 2], 'b': ['foo', 'bar']})

In Rust I have a struct MyStruct with the fields a and b:

struct MyStruct {
  a: i64,
  b: String,
}

I would like to convert each row in the dataframe to an instance of MyStruct, mapping the dataframe to a vector of MyStructs. This should be done on the Rust side.

I can do this on the Python side (assuming MyStruct is exposed as a pyclass) by first getting a list of Python dicts and then constructing a Python list of MyStruct:

df_as_list = df.to_struct('MyStruct').to_list()
[MyStruct(**x) for x in df_as_list]

To spice things up a bit more, imagine that MyStruct has an enum field instead of a String field:

enum MyEnum {
  Foo,
  Bar,
}
struct MyStruct {
  a: i64,
  b: MyEnum,
}

With a suitable function string_to_myenum that maps strings to MyEnum (that is, "foo" to Foo and "bar" to Bar) it would be great to map the dataframe to the new MyStruct.
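For concreteness, a minimal sketch of what such a string_to_myenum could look like. The signature is an assumption: it returns an Option so that strings other than "foo" and "bar" are reported as None instead of panicking.

```rust
/// The enum from the question, with derives added so values can be compared and printed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MyEnum {
    Foo,
    Bar,
}

/// Hypothetical helper: maps "foo" to `Foo` and "bar" to `Bar`.
/// Any other string yields `None`.
fn string_to_myenum(s: &str) -> Option<MyEnum> {
    match s {
        "foo" => Some(MyEnum::Foo),
        "bar" => Some(MyEnum::Bar),
        _ => None,
    }
}
```

The match is case-sensitive, so "Foo" would also map to None in this sketch.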

Upvotes: 1

Views: 614

Answers (1)

Chayim Friedman

Reputation: 70900

Zip the columns together:

let arr: Vec<MyStruct> = df["a"]
    .i64()
    .expect("`a` column of wrong type")
    .iter()
    .zip(df["b"].str().expect("`b` column of wrong type").iter())
    .map(|(a, b)| {
        Some(MyStruct {
            a: a?,
            b: b?.to_owned(),
        })
    })
    .collect::<Option<Vec<_>>>()
    .expect("found unexpected null");

Note, however, that as I said in the comments, this will be slow, especially for large DataFrames, because it walks the data row by row and allocates a new String for every row. Prefer to do things using the Polars APIs where possible.
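For the enum variant from the question, the same zip pattern should work with a fallible conversion inside the closure. The sketch below is not Polars code: it uses plain slices of Option values as stand-ins for the column iterators (iterating an i64 column yields Option<i64> and a string column yields Option<&str>), so it runs without the polars crate. The names rows_to_structs and string_to_myenum are hypothetical, and string_to_myenum is assumed to return an Option.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MyEnum {
    Foo,
    Bar,
}

#[derive(Debug, PartialEq, Eq)]
struct MyStruct {
    a: i64,
    b: MyEnum,
}

/// Hypothetical helper: maps "foo" to `Foo` and "bar" to `Bar`, anything else to `None`.
fn string_to_myenum(s: &str) -> Option<MyEnum> {
    match s {
        "foo" => Some(MyEnum::Foo),
        "bar" => Some(MyEnum::Bar),
        _ => None,
    }
}

/// Zips two "columns" row by row and builds one `MyStruct` per row.
/// Returns `None` if any value is null or any string is not a known variant.
/// The slices stand in for the Polars column iterators (an assumption of this sketch).
fn rows_to_structs(a: &[Option<i64>], b: &[Option<&str>]) -> Option<Vec<MyStruct>> {
    a.iter()
        .zip(b.iter())
        .map(|(a, b)| {
            Some(MyStruct {
                a: (*a)?,                    // null in column `a` aborts with None
                b: string_to_myenum((*b)?)?, // null or unknown string aborts with None
            })
        })
        .collect() // Option<Vec<_>>: the first None short-circuits the whole result
}
```

With real Polars columns, the two slices would be replaced by the `.iter()` calls from the snippet above, and the closure body would stay essentially the same. Collapsing every failure into None loses the reason for the failure; a Result with a descriptive error type may be a better fit if you need to tell nulls apart from unknown strings.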

Upvotes: 1
