filippo

Reputation: 5294

Pandas, replace rows by mean over given columns

I'm pretty new to Pandas and, unfortunately, at the moment I don't have as much time to dig into it as I would like.

I have a dataframe like this:

   x  y  z  class     id  other-numeric-field
0  8  8  5      1  1014f             0.388640
1  2  3  4      0  3ba1d             0.431008
2  5  1  6      1  1014f             0.388640
3  7  9  6      1  1014f             0.388640
4  6  9  1      0  7a5d7             0.476972

I'd like to replace all rows that share the same class with their mean over the ['x', 'y', 'z'] columns.

The dataframe can contain other columns, numeric or not. Their values are usually identical within a class, and if they are not, I don't mind losing that information. I could keep the first occurrence, or just average over them too if that also works with non-numeric fields.

Upvotes: 0

Views: 1189

Answers (2)

Bharath M Shetty

Reputation: 30605

You might be looking for agg, i.e.:

ndf = df.groupby('class').agg({'x':'mean','y':'mean','z':'mean','id':'first','other-numeric-field':'first'})

          id  other-numeric-field         x         z  y
class                                                   
0      3ba1d             0.431008  4.000000  2.500000  6
1      1014f             0.388640  6.666667  5.666667  6
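
Since the question mentions that the dataframe may carry other columns, here is a minimal self-contained sketch that rebuilds the sample data and constructs the agg dict programmatically, so every column not being averaged falls back to 'first'. The names mean_cols and agg_spec are illustrative only and do not come from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "x": [8, 2, 5, 7, 6],
    "y": [8, 3, 1, 9, 9],
    "z": [5, 4, 6, 6, 1],
    "class": [1, 0, 1, 1, 0],
    "id": ["1014f", "3ba1d", "1014f", "1014f", "7a5d7"],
    "other-numeric-field": [0.388640, 0.431008, 0.388640, 0.388640, 0.476972],
})

mean_cols = ["x", "y", "z"]  # columns to average (illustrative name)

# Average the chosen columns; keep the first value of every other column.
agg_spec = {
    col: ("mean" if col in mean_cols else "first")
    for col in df.columns
    if col != "class"
}

# One row per class, indexed by class.
ndf = df.groupby("class").agg(agg_spec)
print(ndf)
```

Keeping 'first' for the remaining columns silently discards any values that differ within a class, which matches what the question says is acceptable.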

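To apply this only to class zero, one approach is to aggregate first and then concatenate the untouched rows with the aggregated row, i.e.:

import pandas as pd

ndf = df.groupby('class',as_index=False).agg({'x':'mean','y':'mean','z':'mean','id':'first','other-numeric-field':'first'})

# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
sdf = pd.concat([df[df['class'].ne(0)], ndf[ndf['class'].eq(0)]], ignore_index=True)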

   class     id  other-numeric-field    x  y    z
0      1  1014f             0.388640  8.0  8  5.0
1      1  1014f             0.388640  5.0  1  6.0
2      1  1014f             0.388640  7.0  9  6.0
3      0  3ba1d             0.431008  4.0  6  2.5
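
The same idea can be wrapped in a small helper that collapses any chosen set of classes. This is only a sketch: collapse_classes and its parameters are hypothetical names, and it uses pd.concat because DataFrame.append is not available in pandas 2.0 and later.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "x": [8, 2, 5, 7, 6],
    "y": [8, 3, 1, 9, 9],
    "z": [5, 4, 6, 6, 1],
    "class": [1, 0, 1, 1, 0],
    "id": ["1014f", "3ba1d", "1014f", "1014f", "7a5d7"],
    "other-numeric-field": [0.388640, 0.431008, 0.388640, 0.388640, 0.476972],
})


def collapse_classes(frame, classes, mean_cols=("x", "y", "z")):
    """Replace the rows of the given classes with one averaged row each.

    Hypothetical helper: columns in mean_cols are averaged, every other
    column keeps its first value within the class.
    """
    spec = {
        col: ("mean" if col in mean_cols else "first")
        for col in frame.columns
        if col != "class"
    }
    selected = frame["class"].isin(classes)
    collapsed = frame[selected].groupby("class", as_index=False).agg(spec)
    # Rows of the other classes stay as they are; collapsed rows go last.
    return pd.concat([frame[~selected], collapsed], ignore_index=True)


sdf = collapse_classes(df, [0])
print(sdf)
```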

Upvotes: 4

MaxU - stand with Ukraine

Reputation: 210832

Is that what you want?

In [18]: df[['x','y','z']] = df.groupby('class')[['x','y','z']].transform('mean')

In [19]: df
Out[19]:
          x  y         z  class     id  other-numeric-field
0  6.666667  6  5.666667      1  1014f             0.388640
1  4.000000  6  2.500000      0  3ba1d             0.431008
2  6.666667  6  5.666667      1  1014f             0.388640
3  6.666667  6  5.666667      1  1014f             0.388640
4  4.000000  6  2.500000      0  7a5d7             0.476972
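
For anyone who wants to run this end to end, here is a minimal sketch on the question's sample data. It works on a copy so the original frame is left intact, and the last step (dropping duplicates to get one row per class) is an optional addition that is not part of the answer above; the names cols, out and one_per_class are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "x": [8, 2, 5, 7, 6],
    "y": [8, 3, 1, 9, 9],
    "z": [5, 4, 6, 6, 1],
    "class": [1, 0, 1, 1, 0],
    "id": ["1014f", "3ba1d", "1014f", "1014f", "7a5d7"],
    "other-numeric-field": [0.388640, 0.431008, 0.388640, 0.388640, 0.476972],
})

cols = ["x", "y", "z"]

out = df.copy()
# transform returns a result aligned with the original rows,
# so every row receives the mean of its own class.
out[cols] = out.groupby("class")[cols].transform("mean")
print(out)

# Optional: keep only the first row of each class.
one_per_class = out.drop_duplicates(subset="class").reset_index(drop=True)
print(one_per_class)
```

Unlike agg, transform keeps the original number of rows, which is why the other columns (such as id) still differ between rows of class 0.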

Upvotes: 5
