loretoparisi

Reputation: 16271

pandas: calculate percentile (quantile) of grouped columns

My dataframe looks like this:

lang score
en    0.7
fr    0.4
en    0.3
...
it    0.7
fr    0.2
de    0.5
...

I want to get the percentiles (pandas quantile) of the score column, grouped by the lang column, so I calculate the mean, median and percentiles as follows:

mean = df.groupby('lang')['score'].mean().sort_values(ascending=False)
median = df.groupby('lang')['score'].median().sort_values(ascending=False)
perc = df.groupby('lang')['score'].quantile(np.linspace(.1, 1, 9, endpoint=False))

While the mean and median are correct, I get NaN for every value in the quantile column:

fr                       0.1                    NaN
                         0.2                    NaN
                         0.3                    NaN
                         0.4                    NaN
                         0.5                    NaN
...                                             ...
en                       0.5                    NaN
                         0.6                    NaN
                         0.7                    NaN
                         0.8                    NaN
                         0.9                    NaN

Where is the error?

Upvotes: 0

Views: 617

Answers (1)

shoegazerstella

Reputation: 69

Could it be that you have NaNs in your dataframe?

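Try running this before computing perc. Note that dropna returns a new DataFrame rather than modifying df in place, so assign the result back:

df = df.dropna(subset=['score'])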
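As a rough sketch of that check: the snippet below counts the missing scores first, drops them, and then recomputes the grouped quantiles. The sample data is made up for illustration (it is not the asker's dataset), and the names n_missing, clean and quantiles are assumptions chosen for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with two missing scores (not the asker's data).
df = pd.DataFrame({
    "lang": ["en", "fr", "en", "it", "fr", "de", "en", "fr"],
    "score": [0.7, 0.4, 0.3, 0.7, 0.2, 0.5, np.nan, np.nan],
})

# Count missing values in the score column before doing anything else.
n_missing = int(df["score"].isna().sum())
print(f"missing scores: {n_missing}")

# dropna returns a new DataFrame, so keep the result.
clean = df.dropna(subset=["score"])

# Nine quantile levels from 0.1 to 0.9 (endpoint=False excludes 1.0).
quantiles = np.linspace(0.1, 1, 9, endpoint=False)

# Result is a Series indexed by (lang, quantile level).
perc = clean.groupby("lang")["score"].quantile(quantiles)
print(perc)
```

If the missing count is zero, NaNs in score are probably not the cause, and it is worth checking the column's dtype and the pandas version next.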

Upvotes: 1
