Reputation: 321
If I have a function f that I am applying more than once to a set of columns, what's a more Pythonic way of going about it? Right now, I am doing this:
newdf=df.groupby(['a', 'b']).apply(lambda x: f(x, 1))
newdf.columns=['1']
newdf['2']=df.groupby(['a', 'b']).apply(lambda x: f(x, 2))
newdf['3']=df.groupby(['a', 'b']).apply(lambda x: f(x, 3))
newdf['4']=df.groupby(['a', 'b']).apply(lambda x: f(x, 4))
Is there a better way of going about it?
Thanks,
Upvotes: 6
Views: 11408
Reputation: 164623
Pandas groupby.apply accepts arbitrary positional and keyword arguments, which are passed on to the applied function. In addition, you can create a dictionary mapping each column name to its argument. Finally, you can reuse a groupby object, which can be defined once outside your loop.
# newdf is assumed to exist already (e.g. holding column '1' as in the question)
argmap = {'2': 2, '3': 3, '4': 4}
grouper = df.groupby(['a', 'b'])
for k, v in argmap.items():
    newdf[k] = grouper.apply(f, v)
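For a self-contained picture of the same idea, here is a minimal sketch. The sample data, the `val` column, and the body of `f` are made up for illustration and are not from the question; it also assumes `f` returns a scalar per group, so that each `apply` call yields a Series.

```python
import pandas as pd

# Hypothetical data and function, for illustration only.
df = pd.DataFrame({
    "a": [1, 1, 2, 2],
    "b": ["x", "x", "y", "y"],
    "val": [1.0, 2.0, 3.0, 4.0],
})

def f(group, n):
    # Scale the group's 'val' column by n and sum it.
    return (group["val"] * n).sum()

# Build the groupby object once and reuse it for every call.
grouper = df.groupby(["a", "b"])

# Extra positional arguments to apply() are forwarded to f.
newdf = pd.DataFrame({str(n): grouper.apply(f, n) for n in range(1, 5)})
```

Building the whole frame from a dictionary also avoids having to create `newdf` from the first call and rename its column afterwards.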
Upvotes: 0
Reputation: 249123
Use agg() to compute multiple values from a single groupby():
df.groupby(['a', 'b']).agg([
    ('1', lambda x: f(x, 1)),
    ('2', lambda x: f(x, 2)),
    ('3', lambda x: f(x, 3)),
    ('4', lambda x: f(x, 4)),
])
Or equivalently:
# i=i binds the current value; a bare lambda would only ever see the final i
df.groupby(['a', 'b']).agg([(str(i), lambda x, i=i: f(x, i)) for i in range(1, 5)])
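One thing to watch for when lambdas are built in a comprehension and only called later, as agg() does: Python looks up the loop variable when the lambda runs, not when it is defined. The sketch below uses plain multiplication as a stand-in for `f` to show the difference; the names `late` and `bound` are illustrative only.

```python
# Each lambda here refers to the comprehension variable i itself,
# so all of them see its final value once the comprehension has finished.
late = [lambda x: x * i for i in range(1, 5)]

# A default argument is evaluated at definition time,
# so each lambda keeps its own copy of i.
bound = [lambda x, i=i: x * i for i in range(1, 5)]

late_results = [fn(10) for fn in late]
bound_results = [fn(10) for fn in bound]
```

Here `late_results` is `[40, 40, 40, 40]`, while `bound_results` is `[10, 20, 30, 40]`. `functools.partial` is another common way to get the same binding.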
Upvotes: 1
Reputation: 3713
That's Pythonic enough for me:
import pandas as pd

# Collect one Series per argument, keyed by the target column name
columns_dict = {}
for i in range(1, 5):
    columns_dict[str(i)] = df.groupby(["a", "b"]).apply(lambda x: f(x, i))
pd.DataFrame(columns_dict)
Upvotes: 2
Reputation: 1398
You could do:
pandas.DataFrame([df.groupby(['a','b']).apply(lambda x : f(x,i)) for i in range(1,5)])
Each call's result becomes a row here, so transpose the new DataFrame if you want one column per call, as in the original approach.
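If you would rather get one column per call directly, a possible variant is `pd.concat` with `axis=1` and `keys`. This is a different technique from the constructor call above, sketched here under assumptions: the sample data, the `val` column, and the body of `f` are hypothetical, and `f` is assumed to return a scalar per group.

```python
import pandas as pd

# Hypothetical data and function, for illustration only.
df = pd.DataFrame({
    "a": [1, 1, 2],
    "b": ["x", "x", "y"],
    "val": [1.0, 2.0, 3.0],
})

def f(group, n):
    # Scale the group's 'val' column by n and sum it.
    return (group["val"] * n).sum()

grouper = df.groupby(["a", "b"])
args = range(1, 5)

# One Series per argument; n=n fixes the value for each lambda.
results = [grouper.apply(lambda g, n=n: f(g, n)) for n in args]

# Place the Series side by side and name the columns via keys.
wide = pd.concat(results, axis=1, keys=[str(n) for n in args])
```

The result is indexed by the (a, b) groups and has columns '1' through '4', so no transpose or renaming step is needed.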
Upvotes: 1