Punter Vicky

Reputation: 16982

Pyarrow Dataset read specific columns and specific rows

Is there a way to use a pyarrow Parquet dataset to read only specific columns and, if possible, filter rows, instead of reading the whole file into a dataframe?

Upvotes: 5

Views: 12546

Answers (1)

Asclepius

Reputation: 63282

As of pyarrow==2.0.0, this is possible at least with pyarrow.parquet.ParquetDataset.

To read specific columns, its read and read_pandas methods have a columns option. You can also do this with pandas.read_parquet.

To read specific rows, pass the filters option to its constructor (the __init__ method).

Upvotes: 6
