dejanmarich
dejanmarich

Reputation: 1285

Calculate difference between values in rows by group

I have df like this:

enter image description here

I need to calculate difference between X1, X2, X3, X4 and X5 for each company, but I know only how to calculate difference between whole column.

df['dX1'] = df['X1'].shift(-1) - df['X1']
df['dX2'] = df['X2'].shift(-1) - df['X2']
df['dX3'] = df['X3'].shift(-1) - df['X3']
...

It's bad approach because it's substracting X1 on line 74 with X1 on line 73 (and that's 2 different companies, that makes no sense).

My question is, how to calculate difference between row values but for every period and for every company. E.g.

enter image description here

(I can load csvs one by one and calculate diff for each company, then merge all into one list, but it will take 2 days because i have 700 csv files).

Upvotes: 6

Views: 7246

Answers (1)

jezrael
jezrael

Reputation: 862581

I think you need DataFrameGroupBy.diff:

cols = ['X1','X2','X3','X4','X5']
df1 = df.groupby('company')[cols].diff(-1)

Upvotes: 7

Related Questions