Cornelius Roemer

Reputation: 7859

How to read_csv a zstd-compressed file using python-polars

In contrast to pandas, polars doesn't natively support reading zstd-compressed CSV files.

How can I get polars to read a compressed CSV file, for example using xopen?

I've tried this:

from xopen import xopen
import polars as pl

with xopen("data.csv.zst", "r") as f:
    d = pl.read_csv(f)

but this errors with:

pyo3_runtime.PanicException: Expecting to be able to downcast into bytes from read result.: 
   PyDowncastError

Upvotes: 1

Views: 1541

Answers (2)

Cornelius Roemer

Reputation: 7859

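Open the file with xopen in binary mode ("rb") instead of text mode ("r"). As the error message indicates, polars expects the file object's read result to be bytes. With "rb" it works: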

from xopen import xopen
import polars as pl

with xopen("data.csv.zst", "rb") as f:
    d = pl.read_csv(f)

Beware that the entire file will be read into memory before parsing, even if you only need a subset of the columns or rows.
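
To see why the mode matters, here is a minimal sketch that uses only the standard library. gzip stands in for zstd, on the assumption that xopen follows the same text/binary convention; the file name and contents are made up for illustration. In text mode read() returns str, in binary mode it returns bytes, which is what the error message says polars needs.

```python
import gzip
import os
import tempfile

# gzip (standard library) stands in for zstd here; this assumes xopen
# treats text and binary modes the same way gzip.open does.
path = os.path.join(tempfile.mkdtemp(), "data.csv.gz")  # hypothetical file
with gzip.open(path, "wb") as f:
    f.write(b"a,b\n1,2\n")

# Text mode: read() returns str, which a bytes-based reader cannot consume.
with gzip.open(path, "rt") as f:
    text_chunk = f.read()

# Binary mode: read() returns bytes.
with gzip.open(path, "rb") as f:
    binary_chunk = f.read()

print(type(text_chunk).__name__, type(binary_chunk).__name__)
```

Note that gzip.open defaults to binary mode for "r", so the sketch requests text mode explicitly with "rt".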

Upvotes: 2

ritchie46

Reputation: 14660

"polars doesn't natively support reading compressed csv files."

This is not really true. We support decompression for zlib and gzip. You can open a feature request for zstd, and we can look into supporting that as well.

Upvotes: 1
