Reputation: 1087
I have a pandas DataFrame on which I need to do some data manipulation. The following code gives me the average of column "Variable" grouped by "key":
df.groupby('key').Variable.transform("mean")
The advantage of using "transform" is that it returns the result with the same index as the original DataFrame, which is pretty useful.
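To illustrate the index-preserving behaviour, here is a minimal sketch on a made-up frame. The data values are assumptions for illustration only; just the column names match my real data.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "key": ["a", "b", "a", "b"],
    "Variable": [1.0, 2.0, 3.0, 6.0],
})

# One group mean per row, aligned to the original index of df.
group_mean = df.groupby("key").Variable.transform("mean")

# Because the index matches, the result can be assigned straight back.
df["Variable_mean"] = group_mean
```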
Now I want to use my own custom function within "transform" instead of "mean". Moreover, my function needs two or more columns, something like:
lambda Variable, Variable1, Variable2: (Variable + Variable1) / Variable2
(My actual function is more complicated than this example.) Each row of my DataFrame has Variable, Variable1 and Variable2.
I am wondering whether I can define and use such a custom function within "transform" so that the result is returned with the same index?
Thanks, Amir
Upvotes: 2
Views: 2444
Reputation: 46
Why not simply use
(df.Variable + df.Variable1) / df.Variable2
There is no need to group by. If, for example, you want to divide by df.groupby('key').Variable2.transform("mean")
you can still do that with transform, as follows:
(df.Variable + df.Variable1) / df.groupby('key').Variable2.transform("mean")
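As a minimal sketch of that idea, assuming a small made-up frame (the values are hypothetical; only the column names come from the question), the group-level piece is computed with transform and then combined with ordinary row-wise arithmetic:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b"],
    "Variable": [1.0, 2.0, 3.0, 4.0],
    "Variable1": [3.0, 2.0, 5.0, 8.0],
    "Variable2": [1.0, 3.0, 2.0, 6.0],
})

# Per-group mean of Variable2, broadcast back to every row of df.
v2_group_mean = df.groupby("key").Variable2.transform("mean")

# Row-wise numerator divided by the group-level denominator.
result = (df.Variable + df.Variable1) / v2_group_mean
```

Since both operands share the index of df, the result keeps that index too.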
Upvotes: 0
Reputation: 32105
Don't call transform against Variable; call it on the grouper and then access your variables on the dataframe the function receives as its argument:
df.groupby('key').transform(lambda x: (x.Variable + x.Variable1)/x.Variable2)
Upvotes: 2