awadhesh204
awadhesh204

Reputation: 93

Percentiles along horizontal rows in Python/Pandas

I need to calculate various percentiles along rows into 1 column in a Pandas dataframe. For example:

df['P90'] = df[['col1','col2','col3','col4','col5']].apply(quantile(0.9), axis=1)
df['P50'] = df[['col1','col2','col3','col4','col5']].apply(quantile(0.5), axis=1)

I have the following dataframe:

ID      2019/31 2019/32 2019/33 2019/34 2019/35 2019/36 2019/37 2019/38 2019/39 2019/40
258101  67000                                   
258102  56750   19105   35990   41250   44425   51275   1071    8125    16375
258103  8528    6853    3291            3000    5640    11248   
258104          27532   19523   12092   7933    8675    435     1045    5115    1450
258105          40000                   285500  16500

I need the output in the following format:

ID      2019/31 2019/32 2019/33 2019/34 2019/35 2019/36 2019/37 2019/38 2019/39 2019/40 P_50    P_90 
258101  67000                                                                           x1      x2
258102  56750   19105   35990   41250   44425   51275   1071    8125    16375           x3      x4
258103  8528    6853    3291            3000    5640    11248                           x5      x6
258104          27532   19523   12092   7933    8675    435     1045    5115    1450    x7      x8
258105          40000                   285500  16500                                   x9      x10

I've tried the following:

cols = ['2019/31', '2019/32', '2019/33', '2019/34', '2019/35', '2019/36', '2019/37', '2019/38', '2019/39', '2019/40']
df['P_50'] = df[cols].apply(np.median, axis=1)
df['P_50'] = df[cols].apply(np.quantile(0.5), axis=1)

perc99 = np.vectorize(lambda x: np.percentile(x, 50))
df['P_50'] = perc99(df[cols].values)

None of it is giving the desired output.

Upvotes: 0

Views: 227

Answers (2)

coco18
coco18

Reputation: 1095

I think, this will work:

df['P_50'] = np.median(df.values, axis=1)
df['P_90'] = np.quantile(df.values, 0.9, axis=1)

Upvotes: 0

GLarose
GLarose

Reputation: 171

I scraped your df from the posting. Is this what your are trying to do?:

df['P_50'] = df.iloc[:,1:-1].median(axis=1)
df['P_50'] = df.iloc[:,1:-1].quantile(0.5, axis=1)

this excluding the ID column

Upvotes: 1

Related Questions