Reputation: 113
I'm learning the Rust version of Polars and I have a question: is there a way to create a dataframe (or lazy dataframe) from a struct?
I have some financial data from a data provider that sends me JSON in response to an HTTP request.
I deserialise this JSON into a struct, and I'd like to create a Polars dataframe from that struct.
Alternatively, is there an analog of Python's polars.read_json in the Rust version?
Upvotes: 1
Views: 387
Reputation: 11
I am assuming that after read_json your dataframe looks like the example below, and I am answering accordingly.
import polars as pl

exp = {
    "a": ["a", "aa", "aaa"],
    "b": ["b", "bb", "bbb"],
    "c": ["c", "cc", "ccc"],
}
# Pack columns a and b into the struct column "first", and b and c into "second".
df_exp = pl.DataFrame(exp).select(
    [
        pl.struct(["a", "b"]).alias("first"),
        pl.struct(["b", "c"]).alias("second"),
    ]
)
df_exp
Result
shape: (3, 2)
first            second
struct[2]        struct[2]
{"a","b"}        {"b","c"}
{"aa","bb"}      {"bb","cc"}
{"aaa","bbb"}    {"bbb","ccc"}
If you index a single element of the column first, you can see the struct with its field names, like below.
df_exp['first'][0]
Result
{'a': 'a', 'b': 'b'}
To flatten a struct column you can use the unnest function, which creates one column per struct field, like below.
df_exp.unnest('first')
Result
shape: (3, 3)
a        b        second
str      str      struct[2]
"a"      "b"      {"b","c"}
"aa"     "bb"     {"bb","cc"}
"aaa"    "bbb"    {"bbb","ccc"}
Here is a link to a notebook that covers almost the same case you are asking about: https://www.kaggle.com/code/baladevdebasisjena/otto-polars-light-jsonl-to-paraquest-conversion
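Coming back to the original question (JSON from an HTTP response, deserialised into a typed record), here is a minimal Python sketch of the reshaping step. Everything specific in it is an assumption made for illustration: the Quote record, its symbol and price fields, the sample payload, and the to_columns helper are all hypothetical. It uses only the standard library and builds a dict of column lists, which is the same shape as exp in the example above.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Quote:
    # Hypothetical record type; the field names are illustrative only.
    symbol: str
    price: float


def to_columns(records):
    """Turn a list of dataclass instances into a dict of column lists."""
    if not records:
        return {}
    names = [f.name for f in fields(records[0])]
    return {name: [getattr(r, name) for r in records] for name in names}


# Stand-in for the body of the HTTP response (made-up sample data).
payload = '[{"symbol": "AAA", "price": 1.5}, {"symbol": "BBB", "price": 2.25}]'

# Deserialise each JSON object into a typed record, then pivot to columns.
quotes = [Quote(**row) for row in json.loads(payload)]
columns = to_columns(quotes)
print(columns)  # {'symbol': ['AAA', 'BBB'], 'price': [1.5, 2.25]}
```

Since columns has the same dict-of-lists shape as exp, it should be usable in the same way, i.e. passed to pl.DataFrame as in the first snippet. The same idea (pivot the deserialised records into one vector per field) may carry over to the Rust side, but I have not shown that here.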
Upvotes: 1