Pranjal Tiwari

Reputation: 33

How to fix this error when using pandas-profiling in a Jupyter notebook

Every time I use pandas-profiling on a data set, the notebook shows me this error:

IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices.

import pandas as pd
from pandas_profiling import ProfileReport

# Raw string so the backslashes in the Windows path are not read as escape sequences
df = pd.read_csv(r'H:\DATA Sets\cereal.csv')

profile = ProfileReport(df, title='cereal-eda', html={'style': {'full_width': True}})

Dataset used: cereal.csv from Kaggle, https://www.kaggle.com/crawford/80-cereals

Upvotes: 2

Views: 2310

Answers (1)

jrbergen

Reputation: 658

Edit: A PR has already been made to fix this. It seems to be an issue when using pandas 1.4.0 or 1.4.1. See the corresponding issue on pandas-profiling's GitHub.
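To see whether your environment matches the versions mentioned above, here is a minimal sketch that reads the installed versions through the standard library. The helper names `installed_version` and `is_suspect` are made up for this example, and the set of suspect versions is only what this answer reports, not an authoritative list.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: the pandas versions this answer reports as problematic.
SUSPECT_PANDAS_VERSIONS = {"1.4.0", "1.4.1"}


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_suspect(pandas_version):
    """True if the given version string is one of the versions listed above."""
    return pandas_version in SUSPECT_PANDAS_VERSIONS


pandas_version = installed_version("pandas")
print("pandas:", pandas_version, "| suspect:", is_suspect(pandas_version))
print("pandas-profiling:", installed_version("pandas-profiling"))
```

If the check reports a suspect version, upgrading pandas-profiling once the fix is released, or applying the manual change described below, are the options I know of.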

I think the error occurs because NumPy deprecated indexing arrays in the manner used by one of pandas-profiling's modules.

If your traceback, like mine, shows the error occurring in pandas_profiling.model.pandas.utils_pandas, you should be able to fix it by changing:

w_median = data[weights == np.max(weights)][0]

to

w_median = data[np.where(weights == np.max(weights))][0]

This line is in the weighted_median function in $(YOUR_VIRTUAL_ENVIRONMENT_OR_PYTHON_DIR)/lib/python$(PYVERSION)/site-packages/pandas_profiling/model/pandas/utils_pandas.py (line 13 for pandas-profiling version 3.1.0).
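As a sanity check that the replacement selects the same element, here is a small sketch with plain NumPy arrays. The values are made up, and it does not reproduce the IndexError itself (that depends on the objects pandas-profiling passes in). It only shows that indexing with np.where(...) picks the same entries as the boolean mask.

```python
import numpy as np

# Made-up sample values; in pandas-profiling these come from the profiled column.
data = np.array([10, 20, 30, 40])
weights = np.array([1, 5, 2, 5])

mask = weights == np.max(weights)

# Original form: index directly with the boolean mask.
by_mask = data[mask][0]

# Patched form: convert the mask to integer positions first.
# np.where(mask) returns a tuple holding one array of matching positions.
by_where = data[np.where(mask)][0]

print(by_mask, by_where)
```

Both expressions take the first element whose weight equals the maximum, so the patch should not change the result when the mask form works at all.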

Upvotes: 6
