emax

Reputation: 7245

Python: how to groupby a given percentile?

I have a dataframe df:

df
    User   City     Job             Age
0    A      x    Unemployed         33
1    B      x     Student           18
2    C      x    Unemployed         27
3    D      y  Data Scientist       28
4    E      y    Unemployed         45
5    F      y     Student           18

I want to group by City and compute some statistics. To compute the mean, I can do the following:

tmp = df.groupby('City')['Age'].mean()

I would like to do the same for a specific quantile. Is that possible?

Upvotes: 2

Views: 241

Answers (4)

heena bawa

Reputation: 828

You can use:

import numpy as np
df.groupby('City')['Age'].apply(lambda x: np.percentile(x, [25, 75])).reset_index().rename(columns={'Age': '25%, 75%'})

  City      25%, 75%
0    x  [22.5, 30.0]
1    y  [23.0, 36.5]
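
The result above stores both percentiles as one array per row, which can be awkward to filter or plot. A minimal sketch of one way to get a separate column per percentile, assuming the question's data is rebuilt as below (the names `percentiles`, `labels` and `result` are only for illustration):

```python
import numpy as np
import pandas as pd

# Rebuild the example data from the question.
df = pd.DataFrame({
    "User": list("ABCDEF"),
    "City": ["x", "x", "x", "y", "y", "y"],
    "Job": ["Unemployed", "Student", "Unemployed",
            "Data Scientist", "Unemployed", "Student"],
    "Age": [33, 18, 27, 28, 45, 18],
})

percentiles = [25, 75]
labels = [f"{p}%" for p in percentiles]

# Iterating over the grouped column yields (city, ages) pairs, so each
# city maps to an array holding its percentiles. Transposing puts the
# cities on the rows and the percentiles on the columns.
result = pd.DataFrame(
    {city: np.percentile(ages, percentiles)
     for city, ages in df.groupby("City")["Age"]},
    index=labels,
).T.rename_axis("City")

print(result)
```

Each percentile is then an ordinary numeric column, e.g. `result["75%"]`.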

Upvotes: 1

BENY

Reputation: 323226

I would use describe:

df.groupby('City')['Age'].describe()[['25%','75%']]
       25%   75%
City            
x     22.5  30.0
y     23.0  36.5
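
By default describe only reports the 25%, 50% and 75% percentiles. A short sketch of asking for other ones through its `percentiles` argument, assuming a pandas version where the grouped describe forwards that argument (the 40% and 90% levels are arbitrary choices for illustration):

```python
import pandas as pd

# Only the columns needed for the example.
df = pd.DataFrame({
    "City": ["x", "x", "x", "y", "y", "y"],
    "Age": [33, 18, 27, 28, 45, 18],
})

# Request the 40th and 90th percentiles instead of the defaults.
# describe() still adds the median ("50%") on its own.
stats = df.groupby("City")["Age"].describe(percentiles=[0.4, 0.9])

# Columns are labelled with the percentage as a string.
custom = stats[["40%", "90%"]]
print(custom)
```

Note that describe also computes count, mean, std, min and max, so it does more work than is needed if only the percentiles matter.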

Upvotes: 1

jezrael

Reputation: 862511

I believe you need DataFrameGroupBy.quantile:

tmp = df.groupby('City')['Age'].quantile(0.4)
print(tmp)
City
x    25.2
y    26.0
Name: Age, dtype: float64

tmp = df.groupby('City')['Age'].quantile([0.25, 0.75]).unstack().add_prefix('q')
print(tmp)
      q0.25  q0.75
City              
x      22.5   30.0
y      23.0   36.5
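
Values such as 25.2 are not ages that occur in the data, because quantile interpolates linearly between neighbouring observations by default. If an observed value is preferred, the `interpolation` argument may help. A small sketch under that assumption (the `comparison` frame is only there to show the results side by side):

```python
import pandas as pd

# Only the columns needed for the example.
df = pd.DataFrame({
    "City": ["x", "x", "x", "y", "y", "y"],
    "Age": [33, 18, 27, 28, 45, 18],
})

grouped = df.groupby("City")["Age"]

# Default: linear interpolation between the two nearest observations.
linear = grouped.quantile(0.4)
# "lower" / "higher" pick the observed value just below / above instead.
lower = grouped.quantile(0.4, interpolation="lower")
higher = grouped.quantile(0.4, interpolation="higher")

comparison = pd.DataFrame({"linear": linear, "lower": lower, "higher": higher})
print(comparison)
```

For city x the sorted ages are 18, 27, 33, so the 0.4 quantile falls between 18 and 27: "lower" gives 18, "higher" gives 27, and the default gives 25.2.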

Upvotes: 3

warwick12

Reputation: 316

def q1(x):
    return x.quantile(0.25)

def q2(x):
    return x.quantile(0.75)

fc = {'Age': [q1,q2]}
temp = df.groupby('City').agg(fc)
temp

       Age      
        q1    q2
City            
x     22.5  30.0
y     23.0  36.5
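
Writing one function per quantile gets repetitive once more levels are needed. A possible variation on the same idea, sketched with a hypothetical helper `quantile_at` (not part of pandas) that builds the functions and names them, since agg uses each function's `__name__` as the column label:

```python
import pandas as pd

# Only the columns needed for the example.
df = pd.DataFrame({
    "City": ["x", "x", "x", "y", "y", "y"],
    "Age": [33, 18, 27, 28, 45, 18],
})


def quantile_at(percent):
    """Hypothetical helper: build an aggregation function for one percentile."""
    def compute(values):
        return values.quantile(percent / 100)
    # agg() labels each output column with the function's __name__,
    # so every generated function needs a distinct one.
    compute.__name__ = f"q{percent}"
    return compute


# One column per requested percentile, named q25, q50, q75.
temp = df.groupby("City")["Age"].agg([quantile_at(p) for p in (25, 50, 75)])
print(temp)
```

Selecting the `Age` column before aggregating keeps the result's columns flat instead of producing the two-level header shown above.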

Upvotes: 4
