Reputation: 33
Every time I use pandas-profiling on a data set, whichever one it is, the notebook shows me this error:
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
import pandas as pd
df = pd.read_csv(r'H:\DATA Sets\cereal.csv')
from pandas_profiling import ProfileReport
profile = ProfileReport(df,title='cereal-eda',html={'style' : {'full_width':True}})
Dataset used: cereal.csv from Kaggle, https://www.kaggle.com/crawford/80-cereals
Upvotes: 2
Views: 2310
Reputation: 658
Edit: A PR has already been made to fix this. It seems to be an issue when using pandas 1.4.0 or 1.4.1. See this issue on pandas-profiling's GitHub.
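To see whether you are on one of the affected versions, something like the following should work. This is only a rough sketch: the helper name installed_version is made up for illustration, and it assumes the packages were installed under the distribution names pandas, numpy and pandas-profiling.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent.

    Hypothetical helper for illustration; it is not part of pandas-profiling.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Distribution names as used by pip (assumed here).
for name in ("pandas", "numpy", "pandas-profiling"):
    print(name, installed_version(name) or "not installed")
```

If pandas reports 1.4.0 or 1.4.1 alongside pandas-profiling 3.1.0, you are probably hitting the same problem.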
I think the error occurs because NumPy deprecated indexing arrays in the manner used by one of pandas-profiling's modules.
If you are getting the same traceback as I am, where the error occurs in pandas_profiling.model.pandas.utils_pandas, you should be able to fix this by changing:
w_median = data[weights == np.max(weights)][0]
to
w_median = data[np.where(weights == np.max(weights))][0]
in the weighted_median function in $(YOUR_VIRTUAL_ENVIRONMENT_OR_PYTHON_DIR)/lib/python$(PYVERSION)/site-packages/pandas_profiling/model/pandas/utils_pandas.py (line 13 for pandas-profiling version 3.1.0).
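For anyone wondering what the changed line actually does, here is a small standalone sketch. The data and weights arrays are made-up values, not anything from pandas-profiling; the point is just that np.where turns the boolean comparison into explicit integer positions, which are then used to index data.

```python
import numpy as np

# Made-up example values, purely for illustration.
data = np.array([10.0, 20.0, 30.0, 40.0])
weights = np.array([0.1, 0.5, 0.5, 0.2])

# Boolean mask marking the entries that carry the maximum weight.
mask = weights == np.max(weights)

# np.where(mask) returns a tuple holding one array of integer positions.
positions = np.where(mask)

# Index with the integer positions and take the first match,
# as in the patched line.
w_median = data[positions][0]
print(w_median)  # 20.0
```

With plain NumPy arrays, data[mask][0] and data[np.where(mask)][0] pick the same element, so the change should not alter the result, only the way the index is expressed.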
Upvotes: 6