Pandas: sort/percentile each row within column categories

Question

I was wondering if there is an elegant way to rank (calculate percentiles) across columns in a Pandas DataFrame, subject to the following condition:

The percentile calculation should be done within each category. Each column belongs to a category, and within each row a value should be ranked only against the other columns in the same category (please see the link for a graphical description).

I learned that I can do the following, which ranks across all columns and disregards the categories:

TargetRanking = StartingData.rank(axis="columns", pct=True)

But I need to group the columns by category, so that each row is ranked separately within each category.

miradulo · Accepted Answer

Assuming you have a dict mapping each column to its category, you can group the columns by that dict and then use rank as before.

categories = {'X1': 'A', 'X3': 'A', 'X5': 'A', 'X2': 'B', 'X4': 'B'}

df.set_index('Date').groupby(categories, axis=1).rank(pct=True)
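For reference, here is a minimal self-contained sketch of the same idea. The sample values and dates are made up for illustration; only the column names and the categories dict come from the answer. It also assumes a newer pandas release, where groupby(..., axis=1) is deprecated, so it transposes the frame, groups the rows by the same dict, and transposes back. On older versions the one-liner above should behave equivalently.

```python
import pandas as pd

# Hypothetical sample data; only the column names match the answer's mapping.
df = pd.DataFrame({
    'Date': ['2017-01-01', '2017-01-02'],
    'X1': [1.0, 6.0],
    'X2': [5.0, 2.0],
    'X3': [3.0, 4.0],
    'X4': [2.0, 8.0],
    'X5': [2.0, 1.0],
})
categories = {'X1': 'A', 'X3': 'A', 'X5': 'A', 'X2': 'B', 'X4': 'B'}

values = df.set_index('Date')

# Transpose so the original columns become rows, group those rows by
# category, rank within each group, then transpose back.
ranked = values.T.groupby(categories).rank(pct=True).T
print(ranked)
```

Each row of ranked holds percentile ranks computed separately for the category A columns (X1, X3, X5) and the category B columns (X2, X4). Transposing works cleanly here because all value columns share one dtype; with mixed dtypes the round trip may need extra care.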
