Operations within DataFrameGroupBy

Question

I am trying to understand how to apply a function within each group of a grouped DataFrame (the result of 'groupby').

import pandas as pd
import numpy as np
df = pd.DataFrame({'Stock' : ['apple', 'ford', 'google', 'samsung','walmart', 'kroger'],
                   'Sector' : ['tech', 'auto', 'tech', 'tech','retail', 'retail'],
                   'Price': np.random.randn(6),
                   'Signal' : np.random.randn(6)},  columns= ['Stock','Sector','Price','Signal'])
dfg = df.groupby(['Sector'],as_index=False)

type(dfg)
pandas.core.groupby.DataFrameGroupBy

I want to get sum(Price * (1/Signal)) grouped by 'Sector', i.e. the resulting output should look like:

Sector  |   Value
auto    |  0.744944
retail  | -0.572164053
tech    | -1.454632

I can get the results by creating separate data frames, but I am looking for a way to operate within each of the grouped (sector) frames.

I can find the mean or sum of Price:

dfg.agg({'Price' : [np.mean, np.sum] }).head(2)

but I cannot get sum(Price * (1/Signal)), which is what I need.

Thanks,

CT Zhu · Accepted Answer

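Because your data is random, we cannot reproduce the exact numbers you got. But based on what you described, I think the following will do. Since Price * (1/Signal) is just Price / Signal, you can compute the ratio row by row first and then group the result by Sector: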

In [121]:

(df.Price/df.Signal).groupby(df.Sector).sum()
Out[121]:
Sector
auto     -1.693373
retail   -5.137694
tech     -0.984826
dtype: float64
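
If the computation really has to happen inside each group, as the question asks, one possible sketch is to pass a function to groupby(...).apply, which receives each sector's sub-frame in turn. The fixed Price and Signal values below are made up for illustration (they are not the asker's data) so that the result is reproducible, and the names vectorized, per_group and result are hypothetical.

```python
import pandas as pd

# Illustrative fixed values (assumed, not from the question) so results are reproducible.
df = pd.DataFrame({
    'Stock': ['apple', 'ford', 'google', 'samsung', 'walmart', 'kroger'],
    'Sector': ['tech', 'auto', 'tech', 'tech', 'retail', 'retail'],
    'Price': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    'Signal': [2.0, 4.0, 1.0, 8.0, 5.0, 3.0],
})

# Approach from the answer: compute the ratio per row, then group it by Sector.
vectorized = (df['Price'] / df['Signal']).groupby(df['Sector']).sum()

# Alternative: operate inside each group; g is the sub-DataFrame for one sector.
per_group = df.groupby('Sector')[['Price', 'Signal']].apply(
    lambda g: (g['Price'] / g['Signal']).sum()
)

# Turn the Series into a flat "Sector | Value" table like the one in the question.
result = per_group.rename('Value').reset_index()
print(result)
```

Both approaches should give the same sums per sector. The vectorized form is usually preferable on large frames because apply calls the Python function once per group.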
