Compute row average by another column in Python/Pandas

Question

I have a data frame as below:

user | profit
-------------
Anna |    1.0
Bell |    2.0
Anna |    2.0
Chad |    5.0
Bell |    4.0
Anna |    3.0

I need to compute a running mean of profit per user: each time the same user appears again, I want the mean of that user's profit so far.

For instance, Anna's first profit mean is 1.0 and her second profit mean becomes 1.5, and so on.

The desired result looks like:

user | profit | mean
--------------------
Anna |    1.0 |  1.0
Bell |    2.0 |  2.0
Anna |    2.0 |  1.5
Chad |    5.0 |  5.0
Bell |    4.0 |  3.0
Anna |    3.0 |  2.0

Any suggestions on how to do this in Python/Pandas?

import pandas as pd

record = pd.DataFrame({
    "user": ("Anna", "Bell", "Anna", "Chad", "Bell", "Anna"), 
    "profit": (1.0, 2.0, 2.0, 5.0, 4.0, 3.0)
})

Thanks!

jezrael · Accepted Answer

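Use GroupBy.transform with rolling and mean (here df is the DataFrame the question calls record):

# A window as long as the group, combined with min_periods=1,
# yields the mean of all rows seen so far for each user.
df['mean'] = (df.groupby('user')['profit']
                .transform(lambda x: x.rolling(len(x), min_periods=1).mean()))
print(df)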
   user  profit  mean
0  Anna     1.0   1.0
1  Bell     2.0   2.0
2  Anna     2.0   1.5
3  Chad     5.0   5.0
4  Bell     4.0   3.0
5  Anna     3.0   2.0
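
As a complement to the rolling approach above, here is a minimal sketch of two other ways to get the same per-user running mean. This is not part of the original answer: the column name mean_alt is made up for illustration, and the second variant assumes profit contains no missing values (cumsum skips NaN while cumcount still counts the row).

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    "user": ["Anna", "Bell", "Anna", "Chad", "Bell", "Anna"],
    "profit": [1.0, 2.0, 2.0, 5.0, 4.0, 3.0],
})

# Variant 1: expanding (cumulative) mean within each user group
df["mean"] = (
    df.groupby("user")["profit"]
      .transform(lambda s: s.expanding().mean())
)

# Variant 2: running sum divided by running row count per user
# (assumes no NaN in "profit"; "mean_alt" is an illustrative name)
grouped = df.groupby("user")
df["mean_alt"] = grouped["profit"].cumsum() / (grouped.cumcount() + 1)

print(df)
```

Variant 2 avoids the Python-level lambda, so it may be faster on frames with many groups, though that depends on the data.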
