Linda

Reputation: 321

pandas groupby and apply function on multiple columns

If I have a function f that I am applying more than once to a set of columns, what's a more Pythonic way of going about it? Right now, this is what I am doing:

newdf=df.groupby(['a', 'b']).apply(lambda x: f(x, 1))
newdf.columns=['1']
newdf['2']=df.groupby(['a', 'b']).apply(lambda x: f(x, 2))
newdf['3']=df.groupby(['a', 'b']).apply(lambda x: f(x, 3))
newdf['4']=df.groupby(['a', 'b']).apply(lambda x: f(x, 4))

Is there a better way of going about it?

Thanks,

Upvotes: 6

Views: 11408

Answers (4)

jpp

Reputation: 164623

Pandas groupby.apply accepts arbitrary positional and keyword arguments, which are passed on to the applied function. In addition, you can create a dictionary mapping column names to arguments. Finally, you can reuse a single groupby object, defined outside your loop.

argmap = {'2': 2, '3': 3, '4': 4}

grouper = df.groupby(['a', 'b'])

for k, v in argmap.items():
    newdf[k] = grouper.apply(f, v)
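
As a minimal, self-contained sketch of the pattern above: the sample frame and the body of f are made up for illustration (the question does not show f), and all four columns are built in one pass rather than assigned into an existing newdf.

```python
import pandas as pd

# Hypothetical stand-in for the asker's f: scale the group's 'c' total by k.
def f(group, k):
    return group["c"].sum() * k

# Made-up sample data with the grouping columns 'a' and 'b' from the question.
df = pd.DataFrame({"a": [1, 1, 2], "b": ["x", "x", "y"], "c": [10, 20, 5]})

# Build the groupby object once and reuse it for every call.
grouper = df.groupby(["a", "b"])

# Extra positional arguments to apply() are forwarded to f.
newdf = pd.DataFrame({str(k): grouper.apply(f, k) for k in range(1, 5)})
```

Here newdf is indexed by the (a, b) groups and has one column per argument value.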

Upvotes: 0

John Zwinck

Reputation: 249123

Use agg() to compute multiple values from a single groupby():

df.groupby(['a', 'b']).agg([
    ('1', lambda x: f(x, 1)),
    ('2', lambda x: f(x, 2)),
    ('3', lambda x: f(x, 3)),
    ('4', lambda x: f(x, 4)),
])

Or equivalently:

# i=i binds the current value of i; a bare closure would only see the final i
df.groupby(['a', 'b']).agg([(str(i), lambda x, i=i: f(x, i)) for i in range(1, 5)])
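
One thing to watch when lambdas are built inside a comprehension, as in the one-liner above: Python looks up i when the lambda is called, not when it is created, so every lambda ends up seeing the last value of the loop variable unless the value is bound explicitly. The sketch below uses plain multiplication in place of f to show the difference; nothing in it is pandas-specific.

```python
# Late binding: each lambda reads i at call time, after the loop has finished.
late = [lambda x: x * i for i in range(1, 5)]

# Default argument: i=i captures the value of i at definition time.
bound = [lambda x, i=i: x * i for i in range(1, 5)]

late_results = [g(10) for g in late]
bound_results = [g(10) for g in bound]

print(late_results)   # [40, 40, 40, 40]
print(bound_results)  # [10, 20, 30, 40]
```

functools.partial is another common way to fix the argument, if a default parameter feels too implicit.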

Upvotes: 1

koPytok

Reputation: 3713

That's pythonic enough for me:

columns_dict = dict()
for i in range(1, 5):
    columns_dict[str(i)] = df.groupby(["a", "b"]).apply(lambda x: f(x, i))

pd.DataFrame(columns_dict)

Upvotes: 2

Neroksi

Reputation: 1398

You could do:

pandas.DataFrame([df.groupby(['a','b']).apply(lambda x : f(x,i)) for i in range(1,5)])

Each Series in the list becomes a row of the new DataFrame, so transpose it (.T) if you want one column per call, as in the question.
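
To see why the transpose matters, here is a small sketch of the same construction on made-up data. The body of f and the sample frame are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the asker's f: scale the group's 'c' total by k.
def f(group, k):
    return group["c"].sum() * k

df = pd.DataFrame({"a": [1, 1, 2], "b": ["x", "x", "y"], "c": [10, 20, 5]})

# A list of Series: each one becomes a row, with the (a, b) groups as columns.
wide = pd.DataFrame(
    [df.groupby(["a", "b"]).apply(lambda x: f(x, i)) for i in range(1, 5)]
)

# Transposing puts the groups back on the index, one column per call.
result = wide.T
result.columns = [str(i) for i in range(1, 5)]
```

The lambda is safe here without an i=i default because apply calls it immediately, inside the same loop iteration.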

Upvotes: 1
