Reputation: 824
I have a pandas DataFrame like this:
   year  week            city  avg_rank
0  2016    52           Paris         1
1  2016    52  Gif-sur-Yvette         2
2  2016    52           Paris         1
3  2017     1           Paris         4
4  2016    52           Paris         3
5  2016    52           Paris         5
6  2016    52           Paris         2
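For reproducibility, here is a minimal sketch that rebuilds this sample (plain int/str columns assumed):
import pandas as pd

df = pd.DataFrame({
    'year': [2016, 2016, 2016, 2017, 2016, 2016, 2016],
    'week': [52, 52, 52, 1, 52, 52, 52],
    'city': ['Paris', 'Gif-sur-Yvette', 'Paris', 'Paris', 'Paris', 'Paris', 'Paris'],
    'avg_rank': [1, 2, 1, 4, 3, 5, 2],
})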
But this line of code:
df['real_index'] = df.groupby(by=['year', 'week', 'city']).avg_rank.rank(method='first')
raises the following stack trace:
/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.pyc in rank(self, axis, method, numeric_only, na_option, ascending, pct)
/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.pyc in wrapper(*args, **kwargs)
590 *args, **kwargs)
591 except(AttributeError):
592 raise ValueError
593
594 return wrapper
ValueError:
I have no NaN values in those columns of my DataFrame. I am using Python 2.7 with pandas 0.18.1 and numpy 1.11.0.
My DataFrame has about 9,000,000 rows and 15 columns.
What is more intriguing is that when I run this same line on subsets of my DataFrame (chunks of 1,000,000 rows each), no ValueError is raised.
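For reference, a sketch of such a per-chunk check (the exact chunk boundaries and iloc slicing are assumptions):
for start in range(0, df.shape[0], 1000000):
    subset = df.iloc[start:start + 1000000]
    # Each of these calls completes without raising:
    subset.groupby(by=['year', 'week', 'city']).avg_rank.rank(method='first')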
Is it known behavior that pandas does not handle fairly large DataFrames well, or did I miss something? Any help is welcome!
Upvotes: 3
Views: 2189
Reputation: 824
Since my DataFrame was built from several files, I noticed that some index values were duplicated.
With
df.index = np.arange(df.shape[0])
just after loading the data, it now works.
Indeed, my hypothesis is that some of the groupby groups contained rows sharing the same index value. When I tried subsets of my DataFrame, this collision fortunately/unfortunately never occurred. However, the error message is not very informative.
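For illustration, here is a minimal sketch of the failure mode and of the fix, with two small frames standing in for the several input files (the exact exception raised varies across pandas versions):
import numpy as np
import pandas as pd

# Two chunks, each carrying its own 0-based index, as if loaded from two files.
chunk1 = pd.DataFrame({'year': [2016, 2016], 'week': [52, 52],
                       'city': ['Paris', 'Paris'], 'avg_rank': [1, 3]})
chunk2 = pd.DataFrame({'year': [2016, 2017], 'week': [52, 1],
                       'city': ['Paris', 'Paris'], 'avg_rank': [5, 4]})

df = pd.concat([chunk1, chunk2])   # index is now 0, 1, 0, 1
print(df.index.is_unique)          # False: rows in the same group share labels

# Rebuilding a unique index fixes it (df.reset_index(drop=True) is equivalent):
df.index = np.arange(df.shape[0])
df['real_index'] = df.groupby(by=['year', 'week', 'city']).avg_rank.rank(method='first')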
Upvotes: 9
Reputation: 4787
The data is probably too large to fit in memory, so breaking it up into multiple smaller files makes sense. How big is your dataset? Where does the data come from, a CSV file or a database? Maybe you should check out blaze: https://github.com/blaze/blaze
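If the source is a CSV, one common pandas pattern (separate from blaze) is to stream it in chunks; a sketch with a hypothetical filename:
import pandas as pd

# 'data.csv' is a hypothetical filename; chunksize is the number of rows per chunk.
pieces = []
for chunk in pd.read_csv('data.csv', chunksize=1000000):
    pieces.append(chunk)
df = pd.concat(pieces, ignore_index=True)  # ignore_index avoids duplicated labels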
Upvotes: 0