Reputation: 99
I want to read, update, and write Parquet files using Python 2.7 or an earlier version, but I am running into package-related issues. Please let me know the correct way to do this.
Upvotes: 1
Views: 11688
Reputation: 8796
You can use pyarrow to read and write Parquet files with Python 2.7; see https://arrow.apache.org/docs/python/parquet.html. Note that there are no Python 2.7 wheels available for Windows, so on Windows you need to install pyarrow through conda, or else switch to Linux / OSX.
Read Parquet files:
import pyarrow.parquet as pq
table = pq.read_table("file.parquet")
# Optionally convert to Pandas DataFrame
df = table.to_pandas()
Write Parquet files:
import pyarrow as pa
import pyarrow.parquet as pq
# If your input data is a Pandas DataFrame, we need to convert it to an Arrow table first.
table = pa.Table.from_pandas(df)
pq.write_table(table, "filename.parquet")
Upvotes: 1