Reputation: 1428
I want to pass two functions to the pandas transform method, since the API says it accepts a list of functions or a dict of column names -> functions. I pass a list of functions, but it does not work:
import pandas as pd
import numpy as np
df = pd.DataFrame({'rti':['a','a','b','c','b','c','a'],'ts':[10,10,9,12,9,13,11],'rs':[8,8,22,11,12,11,9]})
df.groupby('rti').transform(['mean','sum'])
It raises: "TypeError: unhashable type: 'list'"
Upvotes: 1
Views: 117
Reputation: 109546
Passing a list works for the DataFrame transform method, but the functions must not be aggregation functions such as sum, max or min.
>>> df.transform([np.abs, np.sign])
        rs            ts
  absolute sign absolute sign
0        8    1       10    1
1        8    1       10    1
2       22    1        9    1
3       11    1       12    1
4       12    1        9    1
5       11    1       13    1
6        9    1       11    1
Refer to the documentation here. Note that the transform method of groupby objects accepts only a single function (a list of functions is supported only by the DataFrame transform method).
Per the docstring of the transform method of groupby objects:
Signature: gb.transform(func, *args, **kwargs)
Docstring: Call function producing a like-indexed DataFrame on each group and return a DataFrame having the same indexes as the original object filled with the transformed values
Parameters
f : function
    Function to apply to each group
Notes
Each group is endowed the attribute 'name' in case you need to know which group you are working on.
The current implementation imposes three requirements on f:
- f must return a value that either has the same shape as the input subframe or can be broadcast to the shape of the input subframe. For example, if f returns a scalar it will be broadcast to have the same shape as the input subframe.
- if this is a DataFrame, f must support application column-by-column in the subframe. If f also supports application to the entire subframe, then a fast path is used starting from the second chunk.
- f must not mutate groups. Mutation is not supported and may produce unexpected results.
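Since the groupby transform takes one function at a time, one way around it is to call it once per function and concatenate the results. This is only a minimal sketch using the question's data; the names grouped, funcs and result are mine, and selecting the numeric columns explicitly is an assumption to keep the grouping column out of the output.

```python
import pandas as pd

df = pd.DataFrame({
    'rti': ['a', 'a', 'b', 'c', 'b', 'c', 'a'],
    'ts': [10, 10, 9, 12, 9, 13, 11],
    'rs': [8, 8, 22, 11, 12, 11, 9],
})

# Group once and keep only the numeric columns to transform.
grouped = df.groupby('rti')[['ts', 'rs']]
funcs = ['mean', 'sum']

# One transform call per function; the dict keys become the outer column level.
result = pd.concat({name: grouped.transform(name) for name in funcs}, axis=1)

# Reorder the column levels to (column, function), similar to what agg produces.
result = result.swaplevel(axis=1).sort_index(axis=1)
print(result)
```

The result keeps the original row index of df, so it can be joined back onto df directly.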
Upvotes: 1
Reputation: 323226
It seems that the groupby transform does not accept a list of functions; there is an open issue on GitHub. As a workaround, aggregate and then reindex by the group key:
df.groupby('rti').agg(['mean','sum']).reindex(df.rti)
Out[12]:
            rs             ts
          mean sum       mean sum
rti
a     8.333333  25  10.333333  31
a     8.333333  25  10.333333  31
b    17.000000  34   9.000000  18
c    11.000000  22  12.500000  25
b    17.000000  34   9.000000  18
c    11.000000  22  12.500000  25
a     8.333333  25  10.333333  31
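Note that the reindexed result is labelled by rti rather than by the original row index. If you want it aligned with df (which is what transform would give you), a possible follow-up is to overwrite the index afterwards. A rough sketch, assuming the same data as the question; agg_stats and out are just illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    'rti': ['a', 'a', 'b', 'c', 'b', 'c', 'a'],
    'ts': [10, 10, 9, 12, 9, 13, 11],
    'rs': [8, 8, 22, 11, 12, 11, 9],
})

# Aggregate per group: one row per rti value, columns are (column, function).
agg_stats = df.groupby('rti')[['ts', 'rs']].agg(['mean', 'sum'])

# Broadcast the per-group rows back to every original row via the group key.
out = agg_stats.reindex(df['rti'])

# Replace the rti labels with the original row index so it lines up with df.
out.index = df.index
print(out)
```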
Upvotes: 2