Reputation: 18786

How to groupby pandas DataFrame by customized function

I have a DataFrame with the columns col1 and sum.

I want to group by the leading digit of col1 and take the mean of sum.

Basically, the result should be:

col1    sum
8       1.5
3       3
7       5

What I have tried is:

def group_condition(col1):
    col1 = str(col1)
    if col1.startswith('8'):
        return 'y'
    else:
        return 'n'


augmented_error_table[['sum']].groupby(augmented_error_table['col1'].groupby(group_condition).groups).mean()

But it doesn't work; it gives me an empty DataFrame.

Upvotes: 0

Answers (3)

Bharath M Shetty

Reputation: 30605

Use astype(str) in the groupby, like this:

df.groupby(df['col1'].astype(str).str[0])['sum'].mean()

Output:
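
A minimal, self-contained reproduction of the one-liner above. The sample values are an assumption (they mirror the data used in Jianxun Li's answer and match the expected result in the question), not the asker's actual table:

```python
import pandas as pd

# Assumed sample data; the real table from the question is not shown.
df = pd.DataFrame({'col1': [801, 802, 391, 701], 'sum': [1, 2, 3, 5]})

# Build the group key: the first character of col1 rendered as a string.
first_digit = df['col1'].astype(str).str[0]

# Group the 'sum' column by that key and average each group.
result = df.groupby(first_digit)['sum'].mean()
print(result)
```

Note that the group labels are strings ('3', '7', '8'), not integers, because the key comes from astype(str).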

Upvotes: 2

Jianxun Li

Reputation: 24752

import pandas as pd
import numpy as np

df = pd.DataFrame(dict(col1=[801,802,391,701], sum=[1,2,3,5]))
# work out initial digit by list comprehension
df['init_digit'] = [str(x)[0] for x in df.col1]
# use groupby, agg function apply to sum column only
df.groupby(['init_digit']).agg({'sum': 'mean'})

Out[23]: 
            sum
init_digit     
3           3.0
7           5.0
8           1.5

Upvotes: 0

maxymoo

Reputation: 36555

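I think the problem is that groupby here needs a Series of group keys rather than the function itself (a function passed to groupby is applied to the index labels, not to the column values), something like this:

table.groupby(table['col1'].map(group_condition))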
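
As a rough sketch of the two ways a custom function can drive groupby: either map it over the column to get a Series of keys, or move the column into the index and pass the function directly, since pandas calls a function key on each index label. The sample data and the name `table` are assumptions for illustration; `group_condition` is the asker's function, condensed:

```python
import pandas as pd

def group_condition(value):
    # Label a single value 'y' if its string form starts with '8', else 'n'.
    return 'y' if str(value).startswith('8') else 'n'

# Assumed sample data; not the asker's actual table.
table = pd.DataFrame({'col1': [801, 802, 391, 701], 'sum': [1, 2, 3, 5]})

# Option 1: apply the function element-wise to build a Series of group keys.
keys = table['col1'].map(group_condition)
by_series = table.groupby(keys)['sum'].mean()

# Option 2: put col1 in the index, then pass the function itself;
# groupby calls it on each index label.
by_function = table.set_index('col1').groupby(group_condition)['sum'].mean()

print(by_series)
print(by_function)
```

Both options give the same grouping here. The first leaves the frame's index untouched, which is usually the less intrusive choice.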

Upvotes: 0
