Reputation: 372
I am trying to read a parquet file using Python 3.6.
import pandas as pd
df = pd.read_parquet('smalldata.parquet')
df.head()
However, this raises an error saying that module pandas has no attribute read_parquet. What dependencies do I need to install to solve this problem?
Edit 1:
I tried to update Pandas, and this is the output from pip:
Requirement already up-to-date: pandas in /home/fatima/miniconda2/lib/python2.7/site-packages (0.24.2)
Requirement already satisfied, skipping upgrade: pytz>=2011k in /home/fatima/miniconda2/lib/python2.7/site-packages (from pandas) (2018.9)
Requirement already satisfied, skipping upgrade: numpy>=1.12.0 in /home/fatima/miniconda2/lib/python2.7/site-packages (from pandas) (1.16.2)
Requirement already satisfied, skipping upgrade: python-dateutil>=2.5.0 in /home/fatima/miniconda2/lib/python2.7/site-packages (from pandas) (2.8.0)
Requirement already satisfied, skipping upgrade: six>=1.5 in /home/fatima/miniconda2/lib/python2.7/site-packages (from python-dateutil>=2.5.0->pandas) (1.12.0)
Edit 2: this is what conda list gives me
pandas 0.24.2 pypi_0 pypi
Upvotes: 2
Views: 7304
Reputation: 97
You will need to install the required packages:
pip install pandas pyarrow s3fs fastparquet
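One caveat, offered as a hedged sketch rather than a definitive fix: a bare pip command installs into whichever Python that pip belongs to, which is not necessarily the interpreter running your script. Building the command from sys.executable targets the running interpreter instead. The helper name pip_install_command is hypothetical, and s3fs is kept only because it is in the list above; as far as I know it is only needed when reading from S3.

```python
import sys


def pip_install_command(packages):
    """Build a pip command bound to the interpreter running this script."""
    # "python -m pip" uses the pip that belongs to sys.executable,
    # rather than whichever "pip" happens to be first on PATH.
    return [sys.executable, "-m", "pip", "install", "--upgrade"] + list(packages)


if __name__ == "__main__":
    command = pip_install_command(["pandas", "pyarrow", "fastparquet", "s3fs"])
    # Only print the command here; pass it to subprocess.check_call to run it.
    print(" ".join(command))
```

Running the printed command from a shell, or passing the list to subprocess.check_call, should install the packages for the same Python that executes the script.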
Upvotes: 1
Reputation: 655
If you are trying to read Parquet files in Pandas, it may be that you don't have one of the engines for reading Parquet files installed, such as pyarrow or fastparquet. Pandas read_parquet requires one of these engines in order to read Parquet files, so you would need to install at least one of them, along with whatever dependencies that library itself requires.
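To check this, here is a minimal diagnostic sketch; the function name parquet_environment_report is hypothetical. It reports which interpreter is running, which Pandas version that interpreter sees, whether that Pandas exposes read_parquet (added in Pandas 0.21.0), and whether either engine is importable. It may also be worth comparing the interpreter path with the pip output in the question, which points at a python2.7 site-packages directory although the question mentions Python 3.6.

```python
import importlib.util
import sys


def parquet_environment_report():
    """Collect facts about the interpreter that is actually running this code."""
    report = {
        "interpreter": sys.executable,
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
        "pandas_version": None,
        "has_read_parquet": False,
        "engines": {},
    }
    # find_spec checks whether a top-level module can be found without importing it.
    for engine in ("pyarrow", "fastparquet"):
        report["engines"][engine] = importlib.util.find_spec(engine) is not None
    if importlib.util.find_spec("pandas") is not None:
        import pandas as pd

        report["pandas_version"] = pd.__version__
        report["has_read_parquet"] = hasattr(pd, "read_parquet")
    return report


if __name__ == "__main__":
    for key, value in parquet_environment_report().items():
        print(key, "=", value)
```

If has_read_parquet is False, or the interpreter path is not the environment you installed into, the Pandas being imported is probably not the one you upgraded.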
If this isn't the issue, could you please post the exact error you are encountering?
Upvotes: 0