Neeraja Bandreddi

Reputation: 437

groupby function in python

I need to apply different mathematical operations to different variables in a dataframe. My data looks like this:

 y    x1  x2 x3
 NB    1   4   2
 SK    2   5   3
 SK    3   6   6
 NB    4   7   9

I want to group my data by the y variable and calculate sum(x1) and max(x2). I also need to apply a user-defined function to x3.

The grouped output should be a pandas dataframe with only the four variables y, x1, x2 and x3, as shown below.

 y    x1  x2 x3
 NB    5   7   5
 SK    5   6   5  

I tried some code and searched several websites, but I couldn't find a solution.

Can anyone please help me with this?

Thanks in advance.

Upvotes: 2

Views: 844

Answers (2)

asongtoruin

Reputation: 10359

When you use .groupby, you can aggregate with .agg. It accepts the names of certain predefined functions (such as 'sum' and 'max'), but you can also apply any user-defined function, either directly or through a lambda; the argument passed to the function is the set of values of that column for each group:

from io import StringIO

import pandas as pd


data = StringIO('''y    x1  x2 x3
NB    1   4   2
SK    2   5   3
SK    3   6   6
NB    4   7   9''')


def func(values):
    return sum(values)/50

df = pd.read_csv(data, sep=r'\s+')

summaries = df.groupby('y').agg({'x1': 'sum',
                                 'x2': 'max',
                                 'x3': lambda vals: func(vals)})

print(summaries)

This prints:

    x1  x2    x3
y               
NB   5   7  0.22
SK   5   6  0.18
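
Note that y ends up as the index here, while the question asks for y as an ordinary column. A minimal sketch of one way to get that shape, assuming the same sample data and the same placeholder func (the real user-defined function is not given in the question), is to call reset_index() on the result:

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    "y": ["NB", "SK", "SK", "NB"],
    "x1": [1, 2, 3, 4],
    "x2": [4, 5, 6, 7],
    "x3": [2, 3, 6, 9],
})


def func(values):
    # Placeholder user-defined aggregation; replace with your own logic.
    return sum(values) / 50


summaries = (
    df.groupby("y")
      # A named function can be passed directly, no lambda wrapper needed.
      .agg({"x1": "sum", "x2": "max", "x3": func})
      # Move the group key y from the index back into a regular column.
      .reset_index()
)

print(summaries)
```

The resulting frame has the four columns y, x1, x2 and x3, in that order.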

Upvotes: 3

jhurst5

Reputation: 77

df.groupby('y')['x1'].agg(lambda x: sum(x.values))

You can change the lambda for whichever operation you are performing on a given column.
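
As a rough illustration of swapping the lambda, here is a sketch that groups on the y column from the question and computes the spread (max minus min) of x3 per group. The spread is only an example operation chosen for illustration, not something the question asks for:

```python
import pandas as pd

# Sample data from the question (only the columns used here).
df = pd.DataFrame({
    "y": ["NB", "SK", "SK", "NB"],
    "x3": [2, 3, 6, 9],
})

# Select one column after grouping, then aggregate it with any callable.
# The lambda receives the x3 values of one group as a Series.
x3_spread = df.groupby("y")["x3"].agg(lambda x: x.max() - x.min())

print(x3_spread)
```

The result is a Series indexed by y; calling reset_index() on it turns it into a two-column dataframe.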

Upvotes: 0
