Siniara

Reputation: 21

Pandas: How to return multiple columns with a custom apply function on a groupby object

The basic idea is that I have a computation that takes multiple columns from a dataframe and returns multiple columns, which I'd like to add to the dataframe. I'd like to do something like this:

import pandas as pd

df = pd.DataFrame({'id': ['i1', 'i1', 'i2', 'i2'], 'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8]})

def custom_f(a, b):
    computation = a+b
    return computation + 1, computation*2

df['c1'], df['c2'] = df.groupby('id').apply(lambda x: custom_f(x.a, x.b))

Desired output:

    id  a   b  c1     c2
0   i1  1   5  7      12
1   i1  2   6  9      16
2   i2  3   7  11     20
3   i2  4   8  13     24

I know how to do this one column at a time, but in reality the 'computation' step that uses the two columns is quite expensive, so I'm trying to figure out how to run it only once.

EDIT: I realised that the given example can be solved without the groupby, but for my use case for the actual 'computation' I'm doing the groupby because I'm using the first and last values of arrays in each group for my computation. For the sake of simplicity I omitted that, but imagine that it is needed.

Upvotes: 2

Views: 2982

Answers (2)

Anurag Dabas

Reputation: 24314

You can try returning both results as a single DataFrame:

def custom_f(a, b):
    computation = a + b
    return pd.concat([computation + 1, computation * 2], axis=1)

Finally:

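# Note: .values assigns by position, so the rows of df are assumed to already be ordered by 'id'
df[['c1', 'c2']] = df.groupby('id').apply(lambda x: custom_f(x.a, x.b)).values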

Output of df:

    id  a   b   c1  c2
0   i1  1   5   7   12
1   i1  2   6   9   16
2   i2  3   7   11  20
3   i2  4   8   13  24
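
If the rows are not already ordered by 'id', a positional assignment can put values on the wrong rows. A minimal sketch of an index-aligned variant follows, assuming a reasonably recent pandas. The names `result` and the column labels inside `custom_f` are my own choices, not part of the original code. The function returns a DataFrame with named columns, `group_keys=False` keeps the original row index on the result, and `join` aligns on that index, so the expensive step still runs once per group:

```python
import pandas as pd

df = pd.DataFrame({'id': ['i1', 'i1', 'i2', 'i2'], 'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8]})

def custom_f(a, b):
    # Stand-in for the expensive step; evaluated once per group
    computation = a + b
    # Return both results as named columns, indexed like the group's rows
    return pd.DataFrame({'c1': computation + 1, 'c2': computation * 2})

# group_keys=False: the result keeps the original row index instead of
# gaining an extra 'id' index level
result = df.groupby('id', group_keys=False)[['a', 'b']].apply(
    lambda g: custom_f(g['a'], g['b'])
)

# join aligns on the index, so the original row order does not matter
df = df.join(result)
print(df)
```

Inside `custom_f` you still have the whole group's `a` and `b` Series, so per-group values such as `a.iloc[0]` and `a.iloc[-1]` are available there if the real computation needs them.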

Upvotes: 2

Muhammad Rasel

Reputation: 724

df['c1'], df['c2'] = custom_f(df['a'], df['b'])  # you don't need apply for the desired output here

Upvotes: 1
