Reputation: 15
I am trying to understand how to apply a function within a 'groupby', i.e. to each of the groups in a dataframe.
import pandas as pd
import numpy as np
df = pd.DataFrame({'Stock': ['apple', 'ford', 'google', 'samsung', 'walmart', 'kroger'],
                   'Sector': ['tech', 'auto', 'tech', 'tech', 'retail', 'retail'],
                   'Price': np.random.randn(6),
                   'Signal': np.random.randn(6)},
                  columns=['Stock', 'Sector', 'Price', 'Signal'])
dfg = df.groupby(['Sector'],as_index=False)
type(dfg)
pandas.core.groupby.DataFrameGroupBy
I want to get the sum of Price * (1/Signal), grouped by 'Sector', i.e. the resulting output should look like:
Sector | Value
auto   | 0.744944
retail | -0.572164053
tech   | -1.454632
I can get the results by creating separate data frames, but I am looking for a way to operate within each of the grouped (sector) frames.
I can find the mean or sum of Price:
dfg.agg({'Price' : [np.mean, np.sum] }).head(2)
but I cannot get sum(Price * (1/Signal)), which is what I need.
Thanks,
Upvotes: 0
Views: 88
Reputation: 54330
You provided random data, so we cannot reproduce the exact numbers that you got. But based on what you described, I think the following will do:
In [121]:
(df.Price/df.Signal).groupby(df.Sector).sum()
Out[121]:
Sector
auto -1.693373
retail -5.137694
tech -0.984826
dtype: float64
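Since the question asks specifically about operating within each group, here is a minimal sketch comparing the one-liner above with a per-group `apply`. The fixed Price and Signal values are made up for illustration (they are not the asker's random data), and the names `vectorized`, `per_group` and `result` are just labels chosen for this example. Both approaches should give the same sums; the vectorized one is usually the simpler choice.

```python
import pandas as pd

# Illustrative fixed values (an assumption, chosen so the sums are easy to check by hand)
df = pd.DataFrame({
    'Stock': ['apple', 'ford', 'google', 'samsung', 'walmart', 'kroger'],
    'Sector': ['tech', 'auto', 'tech', 'tech', 'retail', 'retail'],
    'Price': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    'Signal': [2.0, 4.0, 1.0, 8.0, 5.0, 3.0],
})

# Vectorized: compute Price / Signal row by row, then sum per sector
vectorized = (df['Price'] / df['Signal']).groupby(df['Sector']).sum()

# Per-group: the lambda receives each sector's sub-frame in turn
per_group = (
    df.groupby('Sector')[['Price', 'Signal']]
      .apply(lambda g: (g['Price'] / g['Signal']).sum())
)

# Turn the result into a two-column frame shaped like "Sector | Value"
result = vectorized.reset_index(name='Value')
print(result)
```

With these values, auto is 2/4, retail is 5/5 + 6/3, and tech is 1/2 + 3/1 + 4/8.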
Upvotes: 1