How to group dataframe by column and receive new column for every group

Question

I have the following dataframe:

df = pd.DataFrame({'timestamp': [10, 10, 10, 20, 20, 20], 'idx': [1, 2, 3, 1, 2, 3], 'v1': [1, 2, 4, 5, 1, 9], 'v2': [1, 2, 8, 5, 1, 2]})

    timestamp   idx     v1  v2
0   10           1      1   1
1   10           2      2   2
2   10           3      4   8
3   20           1      5   5
4   20           2      1   1
5   20           3      9   2

I'd like to group the data by timestamp and calculate the following statistic for each group: np.sum(v1*v2). Every row should carry the value for its own timestamp, so the result should look like this:

    timestamp   idx     v1  v2  stat
0   10           1      1   1   37
1   10           2      2   2   37
2   10           3      4   8   37
3   20           1      5   5   44
4   20           2      1   1   44
5   20           3      9   2   44

I'm trying to do the following:

def calc_some_stat(d):
    return np.sum(d.v1 * d.v2)

df.loc[:, 'stat'] = df.groupby('timestamp').apply(calc_some_stat)

But the stat column contains only NaN values. What is wrong with my code?
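
A likely explanation, sketched here rather than taken from any posted answer: groupby(...).apply(...) returns one value per group, indexed by the timestamp values (10 and 20), while df is indexed 0 to 5. Column assignment aligns on index labels, so no label matches and every row becomes NaN. A minimal sketch of two ways to bring the per-group value back onto each row follows; the names per_group and stat_transform are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'timestamp': [10, 10, 10, 20, 20, 20],
    'idx': [1, 2, 3, 1, 2, 3],
    'v1': [1, 2, 4, 5, 1, 9],
    'v2': [1, 2, 8, 5, 1, 2],
})

# One value per group: a Series indexed by timestamp (10, 20), not by 0..5.
per_group = (df['v1'] * df['v2']).groupby(df['timestamp']).sum()

# Option 1: look up each row's timestamp in the per-group Series.
df['stat'] = df['timestamp'].map(per_group)

# Option 2: transform returns the group sum on the original row index,
# so it can be assigned directly without any lookup.
df['stat_transform'] = (df['v1'] * df['v2']).groupby(df['timestamp']).transform('sum')

print(df)
```

Both columns should hold the same numbers. The transform variant keeps everything in one expression, while map can be convenient when the per-group values are also needed on their own.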
