KaSan

Reputation: 47

Data analysis pandas

I am analysing a very large dataset with the columns (G, month, p). The rows are organized by group G, and I process them with pandas groupby.

G   month   p
G1  1   0.040698496
G1  2   0.225640771
G1  3   0.236948047
G1  4   0.119339576
G1  5   0.779272432
G2  1   0.892168636
G2  2   0.062467967
G2  3   0.936044226
G3  1   0.509212613
G3  2   0.476718744
G3  3   0.407299543
G3  4   0.843260893
G4  1   0.882554249

I then extracted the statistics of p for each group G, from 1 to n, as shown below:

        g1           g2           g3           gn
mean    0.280379864  0.630226943  0.559122948  …
std     0.290326376  0.49218285   0.194135874  …
count   5            3            4            …
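
For reference, a table of this shape can be produced with a single groupby aggregation. This is only a minimal sketch: the question does not show the code that was actually used, and the frame name `df` and the result name `stats` are assumptions.

```python
import pandas as pd

# Sample rows copied from the question (G4 has a single row).
df = pd.DataFrame({
    "G": ["G1"] * 5 + ["G2"] * 3 + ["G3"] * 4 + ["G4"],
    "month": [1, 2, 3, 4, 5, 1, 2, 3, 1, 2, 3, 4, 1],
    "p": [0.040698496, 0.225640771, 0.236948047, 0.119339576, 0.779272432,
          0.892168636, 0.062467967, 0.936044226,
          0.509212613, 0.476718744, 0.407299543, 0.843260893,
          0.882554249],
})

# One row per statistic, one column per group (hence the transpose).
stats = df.groupby("G")["p"].agg(["mean", "std", "count"]).T
print(stats)
```

The aggregation yields one value per group, which is why it cannot be multiplied row by row with p without first being aligned back to the original rows.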

I need to create a new column that is the product of each group's standard deviation (std) and the variable p. Is there a way to do this automatically? With more than 200 groups, doing it group by group takes a lot of time. The expected output is:

G   month   p   STD*p
G1  1   0.040698496 0.011815847
G1  2   0.225640771 0.065509467
G1  3   0.236948047 0.068792268
G1  4   0.119339576 0.034647427
G1  5   0.779272432 0.226243341
G2  1   0.892168636 0.439110102
G2  2   0.062467967 0.030745662
G2  3   0.936044226 0.460704915
G3  1   0.509212613 0.098856436
G3  2   0.476718744 0.09254821
G3  3   0.407299543 0.079071453
G3  4   0.843260893 0.16370719

Upvotes: 1

Views: 52

Answers (1)

jezrael

Reputation: 862681

Use GroupBy.transform with 'std' to repeat each group's aggregate value on every row of that group, so the result can be multiplied by the p column directly:

df['STD*p'] = df.groupby('G')['p'].transform('std').mul(df['p'])
print(df)
     G  month         p     STD*p
0   G1      1  0.040698  0.011816
1   G1      2  0.225641  0.065509
2   G1      3  0.236948  0.068792
3   G1      4  0.119340  0.034647
4   G1      5  0.779272  0.226243
5   G2      1  0.892169  0.439110
6   G2      2  0.062468  0.030746
7   G2      3  0.936044  0.460705
8   G3      1  0.509213  0.098856
9   G3      2  0.476719  0.092548
10  G3      3  0.407300  0.079071
11  G3      4  0.843261  0.163707
12  G4      1  0.882554       NaN
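
In the output above, G4 gets NaN because it has a single row, and the sample standard deviation (pandas uses ddof=1 by default) is undefined for one observation. Below is a self-contained sketch of the same approach on the question's sample data. The names `group_std`, `std_by_group`, `alt` and `clean` are illustrative, and dropping the NaN rows is only one possible way to handle single-row groups, not something the question asks for.

```python
import pandas as pd

# Sample rows copied from the question (G4 has a single row).
df = pd.DataFrame({
    "G": ["G1"] * 5 + ["G2"] * 3 + ["G3"] * 4 + ["G4"],
    "month": [1, 2, 3, 4, 5, 1, 2, 3, 1, 2, 3, 4, 1],
    "p": [0.040698496, 0.225640771, 0.236948047, 0.119339576, 0.779272432,
          0.892168636, 0.062467967, 0.936044226,
          0.509212613, 0.476718744, 0.407299543, 0.843260893,
          0.882554249],
})

# transform('std') returns one value per row, aligned with df's index,
# so it can be multiplied element-wise with the p column.
group_std = df.groupby("G")["p"].transform("std")
df["STD*p"] = group_std * df["p"]

# Equivalent two-step form: aggregate once per group, then map the
# per-group value back onto each row through its group label.
std_by_group = df.groupby("G")["p"].std()
alt = df["G"].map(std_by_group) * df["p"]

# Assumption: single-row groups (NaN std) are simply dropped here.
clean = df.dropna(subset=["STD*p"])
print(clean)
```

The transform form is usually the shorter one; the map form can be handy when the per-group statistics are already computed, as in the mean/std/count table from the question.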

Upvotes: 0
