Reputation: 474
I need to open a gzipped file that contains a Parquet file with some data. I am having a lot of trouble reading or printing its contents. I tried the following:
import gzip

with gzip.open("myFile.parquet.gzip", "rb") as f:
    data = f.read()
This does not seem to work: I get an error saying that my file is not a gz file. Thanks!
Upvotes: 10
Views: 26843
Reputation: 9494
You can use the read_parquet function from the pandas module.
Install pandas and pyarrow:
pip install pandas pyarrow
Then call read_parquet, which returns a DataFrame:
from pandas import read_parquet
data = read_parquet("myFile.parquet.gzip")
print(data.count())  # example of operation on the returned DataFrame
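If you want to see why gzip.open rejects the file, you can look at its first few bytes. This is only a rough sketch, and it assumes the file was produced by something like pandas' to_parquet with compression="gzip", which compresses the data inside the Parquet file rather than wrapping the whole file in a gzip stream. The helper name sniff_format is made up for illustration.

```python
GZIP_MAGIC = b"\x1f\x8b"   # every gzip stream starts with these two bytes
PARQUET_MAGIC = b"PAR1"    # a Parquet file starts (and ends) with these four bytes


def sniff_format(path):
    """Guess the container format from the leading bytes (hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    if head == PARQUET_MAGIC:
        return "parquet"
    return "unknown"
```

Calling sniff_format("myFile.parquet.gzip") on a file like the one in the question would most likely report "parquet", in which case read_parquet can open it directly. If it reports "gzip" instead, the file really is gzip-wrapped and would need to be decompressed before being passed to read_parquet.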
Upvotes: 15