nunodsousa

Reputation: 2785

Optimization of a for loop with pandas

This is a problem that I ran into recently, and I very much enjoyed designing a solution for it. I think it is a nice exercise and a good example of combining for loops with pandas.

I have a pandas data frame with four columns: ['runs', 'Period_change', 'Close', 'sum'].


I want to implement an alternative to the following for loop to speed up the program.

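import pandas as pd

game = pd.read_csv('test.csv').set_index('timestamp').dropna()
for l in game['runs'].unique():
    # rows belonging to the current run
    mask = game['runs'] == l
    # total Period_change of the run divided by the run's first Close;
    # .loc avoids chained assignment, which may silently write to a copy
    game.loc[mask, 'sum'] = game.loc[mask, 'Period_change'].sum() / game.loc[mask, 'Close'].iloc[0]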

Upvotes: 0

Views: 55

Answers (1)

BENY

Reputation: 323226

IIUC, your for loop can be replaced by a groupby on 'runs' (here df is your data frame):

df['sum'] = df['runs'].map(df.groupby('runs').apply(lambda x: x['Period_change'].sum() / x['Close'].iloc[0]))
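A variant of the same idea, as a minimal sketch: instead of apply plus map, groupby(...).transform broadcasts each per-group value straight back onto the original rows. The sample values below are made up to stand in for test.csv, which is not shown in the question. The sketch also assumes the rows have no NaN (the question calls dropna()), since transform('first') picks the first non-null value in each group rather than strictly the first row.

```python
import pandas as pd

# Hypothetical sample data standing in for test.csv (values are invented).
game = pd.DataFrame({
    "runs": [1, 1, 2, 2, 2],
    "Period_change": [1.0, 3.0, 2.0, 2.0, 4.0],
    "Close": [2.0, 5.0, 2.0, 8.0, 16.0],
})

grouped = game.groupby("runs")

# Per-run total of Period_change, repeated on every row of that run.
run_total = grouped["Period_change"].transform("sum")

# First non-null Close of each run, repeated on every row of that run.
# With no NaN present this matches x['Close'].iloc[0] in the loop version.
first_close = grouped["Close"].transform("first")

game["sum"] = run_total / first_close
print(game)
```

Both results are aligned to the index of game, so the division is a plain element-wise operation and no map step is needed.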

Upvotes: 1
