Reputation: 75
I am trying to group by several columns and then apply a different function to each of the remaining columns. I referred to the answer here, and my code is shown below:
def f(x):
    d = {}
    d['a'] = x['a'].max()
    d['b'] = x['b'].first()
    d['c'] = x['c'].last()
    return pd.Series(d, index=['a', 'b', 'c'])

require_data = required_data.groupby(['S','id', 'lane', 'timestamp','E']).apply(f)
I get the following error, caused by the call to first():
TypeError: first() missing 1 required positional argument: 'offset'
However, calling first directly on the groupby works fine:
require_data = required_data.groupby(['S','id', 'lane', 'timestamp','E']).first()
What is the cause of the error?
Upvotes: 1
Views: 65
Reputation: 862661
It is better to use GroupBy.agg here. You can pass it a dictionary that maps column names to aggregation methods, including GroupBy.first and GroupBy.last:
require_data = (required_data.groupby(['S','id', 'lane', 'timestamp','E'])
                             .agg({'a':'max', 'b':'first', 'c':'last'}))
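
To see the dictionary form in isolation, here is a minimal sketch on made-up data. The frame `df`, the single grouping key `id`, and all values are assumptions for illustration; only the column names a, b, c come from the question.

```python
import pandas as pd

# Hypothetical toy data, not the asker's frame.
df = pd.DataFrame({
    'id': [1, 1, 2, 2],
    'a': [10, 30, 5, 2],
    'b': ['x', 'y', 'z', 'w'],
    'c': [0.1, 0.2, 0.3, 0.4],
})

# Each column gets its own aggregation, computed once per group.
result = df.groupby('id').agg({'a': 'max', 'b': 'first', 'c': 'last'})
print(result)
```

The result is indexed by the grouping key and has one column per dictionary entry.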
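The error occurs because, inside a function passed to apply, x['b'] is an ordinary Series, and Series.first is a different method from GroupBy.first: it selects the initial periods of a time series and requires an offset argument. If you want to use your own custom function, you need to select values by position, with Series.iat or Series.iloc. But as @Erfan mentioned (thank you): using your own custom function is highly discouraged, because of efficiency.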
def f(x):
    d = {}
    d['a'] = x['a'].max()
    d['b'] = x['b'].iat[0]
    d['c'] = x['c'].iat[-1]
    return pd.Series(d, index=['a', 'b', 'c'])

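
As a rough check that the positional version matches the built-in aggregations, the sketch below runs both on made-up data. The frame `df` and the key `id` are assumptions for illustration, and selecting the columns `[['a', 'b', 'c']]` before apply is my own choice to keep the grouping key out of the function's input.

```python
import pandas as pd

def f(x):
    # x is the sub-frame for one group; x['b'] is a plain Series,
    # so positional access with .iat is used instead of .first()/.last().
    d = {}
    d['a'] = x['a'].max()
    d['b'] = x['b'].iat[0]
    d['c'] = x['c'].iat[-1]
    return pd.Series(d, index=['a', 'b', 'c'])

# Hypothetical toy data, not the asker's frame.
df = pd.DataFrame({
    'id': [1, 1, 2, 2],
    'a': [10, 30, 5, 2],
    'b': ['x', 'y', 'z', 'w'],
    'c': [0.1, 0.2, 0.3, 0.4],
})

custom = df.groupby('id')[['a', 'b', 'c']].apply(f)
builtin = df.groupby('id').agg({'a': 'max', 'b': 'first', 'c': 'last'})

# Compare values column by column (dtypes may differ between the two results).
same = all(custom[col].tolist() == builtin[col].tolist() for col in ['a', 'b', 'c'])
print(same)
```

One caveat: the two approaches agree here because the toy data has no missing values. GroupBy.first and GroupBy.last skip nulls by default, while .iat[0] and .iat[-1] return whatever sits in the first or last row, so results can differ on data with NaN.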
Upvotes: 4