stucash

Reputation: 1258

What is an efficient way to apply a function to one column of each group in a groupby object?

I have a DataFrame with 500K rows. It has the following columns:

            Symbol     Open     High      Low    Close  Volume
Date
01-Aug-2017   AADR  49.8800  49.8800  49.8800  49.8800     790
02-Aug-2017   AADR  49.8432  49.8432  49.8432  49.8432     684

I have 2071 symbols in the dataframe:

>>> grouped = df.groupby('Symbol')
>>> len(grouped)
2071

I want to apply a moving average (here an exponentially weighted mean) to just one column (Close) within each group, and add the resulting values as an extra column in the existing DataFrame.

I believe I could do the following:

results = {}
for name, group in grouped:
    # group is already the sub-DataFrame for this symbol, so no [1] indexing is needed
    ma_col = group.Close.ewm(span=10, min_periods=10).mean()
    results[name] = ma_col

This gives me a dictionary of results, which I could then turn into a DataFrame.

Is there a more efficient (better performance) way to do the same thing?

Upvotes: 0

Views: 46

Answers (1)

cs95

Reputation: 402473

You can use groupby + transform. The result is aligned to the original rows, so it can be assigned straight back as a new column:

df.groupby('Symbol').Close.transform(lambda x: x.ewm(span=10, min_periods=10).mean())
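As a minimal sketch of how that might look end to end: the toy data, the second symbol (AAXJ), and the column name Close_EWM10 below are made up for illustration and are not from the question. It assigns the transform result as a new column and checks it against the per-group loop.

```python
import numpy as np
import pandas as pd

# Toy data standing in for the 500K-row frame (values are invented).
dates = pd.date_range("2017-08-01", periods=15, freq="D")
df = pd.DataFrame(
    {
        "Symbol": ["AADR"] * 15 + ["AAXJ"] * 15,
        "Close": np.concatenate(
            [np.linspace(49.0, 51.0, 15), np.linspace(70.0, 68.0, 15)]
        ),
    },
    index=dates.append(dates).rename("Date"),
)

# transform returns a Series aligned row-for-row with df,
# so it can be assigned directly as a new column.
df["Close_EWM10"] = df.groupby("Symbol")["Close"].transform(
    lambda x: x.ewm(span=10, min_periods=10).mean()
)

# Sanity check against the explicit per-group loop from the question.
for name, group in df.groupby("Symbol"):
    expected = group["Close"].ewm(span=10, min_periods=10).mean()
    actual = df.loc[df["Symbol"] == name, "Close_EWM10"]
    assert np.allclose(actual, expected, equal_nan=True)
```

With min_periods=10, the first nine rows of each symbol come out as NaN. That is expected and not a sign that the grouping went wrong.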

Upvotes: 2
