Reputation: 2785
This is a problem I ran into recently, and I very much enjoyed designing a solution for it. I think it is a nice exercise and a good example of combining for loops with pandas.
I have the following pandas data frame, with four columns: ['runs', 'Period_change', 'Close', 'sum'].
I want to implement an alternative to the following for loop to speed up the program.
import pandas as pd
game = pd.read_csv('test.csv').set_index('timestamp').dropna()
for l in game['runs'].unique():
    # Select the rows of the current run once, then assign via .loc to avoid chained assignment
    mask = game['runs'] == l
    game.loc[mask, 'sum'] = game.loc[mask, 'Period_change'].sum() / game.loc[mask, 'Close'].iloc[0]
Upvotes: 0
Views: 55
Reputation: 323226
IIUC, your for loop can be replaced by groupby (using the frame and column names from your question):
game['sum'] = game['runs'].map(game.groupby('runs').apply(lambda x: x['Period_change'].sum() / x['Close'].iloc[0]))
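If the per-group lambda turns out to be slow on a large frame, one possible variant is to use `transform`, which broadcasts each group's aggregate straight back onto its rows. This is only a minimal sketch: the small frame below is made-up sample data (not the asker's `test.csv`), and it assumes `Close` has no missing values, as is the case after the `dropna()` in the question, so that `'first'` matches `iloc[0]`.

```python
import pandas as pd

# Hypothetical sample data standing in for test.csv (not from the original post).
game = pd.DataFrame({
    'runs': [1, 1, 2, 2, 2],
    'Period_change': [1.0, 3.0, 2.0, 4.0, 6.0],
    'Close': [2.0, 5.0, 4.0, 8.0, 9.0],
})

grouped = game.groupby('runs')

# Sum of Period_change per run, divided by the first Close of that run.
# transform() returns a Series aligned with the original rows, so no map() is needed.
# Note: 'first' picks the first non-null value, which equals iloc[0] only
# when Close has no NaNs (assumed here).
game['sum'] = (
    grouped['Period_change'].transform('sum')
    / grouped['Close'].transform('first')
)

print(game)
```

Both aggregations here are built-in groupby reductions, so they typically avoid calling a Python function once per group; whether that matters depends on how many distinct runs there are.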
Upvotes: 1