Roman

Reputation: 131088

How to apply a function to several columns of a GroupBy object?

Let us assume that we have a GroupBy object obtained by applying a groupby operation to a DataFrame:

grouped = data_frame.groupby(['col_1', 'col_2'])

We can generate a new data frame if we specify how values in the GroupBy object should be combined to get single values. For example:

grouped.agg({'col_3': sum, 'col_4': min, 'col_5': user_defined_function})

In the above example we used functions that take a list (or, more precisely, a Series) as input and return a single value as output. This is nice, but what I need is a function that takes two Series as input. For example, I want to take the values from col_3 and col_4 and use them to generate a single value.

For example, I might want to find the maximum absolute difference between the corresponding values in col_3 and col_4.

Is there a way to do that in pandas?

Upvotes: 2

Views: 122

Answers (1)

Rutger Kassies

Reputation: 64443

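If you don't specify a function per column, each group is passed to the function as a whole DataFrame with all of its columns. This is how apply behaves; agg normally calls the function once per column, so apply is the safer choice when the function needs two columns at once. So: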

import numpy as np
data_frame.groupby(['col_1', 'col_2']).apply(lambda x: np.max(np.abs(x['col_3'] - x['col_4'])))

That gives the maximum absolute difference between col_3 and col_4 for each group.
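To make this concrete, here is a minimal, self-contained sketch. The sample values and the helper name max_abs_diff are made up for illustration; only the column names come from the question. Selecting col_3 and col_4 before calling apply is an optional choice here, made so that the function only ever sees the two columns it needs.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data_frame = pd.DataFrame({
    'col_1': ['a', 'a', 'b', 'b'],
    'col_2': ['x', 'x', 'y', 'y'],
    'col_3': [1, 5, 2, 10],
    'col_4': [4, 3, 2, 1],
})

def max_abs_diff(group):
    # group is a DataFrame with the col_3 and col_4 rows of one (col_1, col_2) group
    return (group['col_3'] - group['col_4']).abs().max()

# Group, keep only the two value columns, then reduce each group to one number.
result = (
    data_frame
    .groupby(['col_1', 'col_2'])[['col_3', 'col_4']]
    .apply(max_abs_diff)
)
print(result)
```

With this sample data the result is a Series indexed by (col_1, col_2), holding 3 for group ('a', 'x') and 9 for group ('b', 'y').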

Upvotes: 3
