Reputation: 568
I have a dataframe of orders as below, where the column 'Value' represents cash in/out and the 'Date' column reflects when the transaction occurred.
Each transaction is grouped, so that the 'QTY' out, is always succeeded by the 'QTY' 'in', reflected by the sign in the 'QTY' column:
Date Qty Price Value
0 2014-11-18 58 495.775716 -2875499
1 2014-11-24 -58 484.280147 2808824
2 2014-11-26 63 474.138699 -2987073
3 2014-12-31 -63 507.931247 3199966
4 2015-01-05 59 495.923771 -2925950
5 2015-02-05 -59 456.224370 2691723
How can I create two columns, 'n_days' and 'price_diff' that is the difference in days between the two dates of each transaction and the 'Value'?
I have tried:
df['price_diff'] = df['Value'].rolling(2).apply(lambda x: x[0] + x[1])
but receiving a key error for the first observation (0).
Many thanks
Upvotes: 2
Views: 1218
Reputation: 150745
Why don't you just use sum
:
df['price_diff'] = df['Value'].rolling(2).sum()
Although from the name, it looks like
df['price_diff'] = df['Price'].diff()
And, for the two columns:
df[['Date_diff','Price_diff']] = df[['Date','Price']].diff()
Output:
Date Qty Price Value Date_diff Price_diff
0 2014-11-18 58 495.775716 -2875499 NaT NaN
1 2014-11-24 -58 484.280147 2808824 6 days -11.495569
2 2014-11-26 63 474.138699 -2987073 2 days -10.141448
3 2014-12-31 -63 507.931247 3199966 35 days 33.792548
4 2015-01-05 59 495.923771 -2925950 5 days -12.007476
5 2015-02-05 -59 456.224370 2691723 31 days -39.699401
Updated Per comment, you can try:
df['Val_sum'] = df['Value'].rolling(2).sum()[1::2]
Output:
Date Qty Price Value Val_sum
0 2014-11-18 58 495.775716 -2875499 NaN
1 2014-11-24 -58 484.280147 2808824 -66675.0
2 2014-11-26 63 474.138699 -2987073 NaN
3 2014-12-31 -63 507.931247 3199966 212893.0
4 2015-01-05 59 495.923771 -2925950 NaN
5 2015-02-05 -59 456.224370 2691723 -234227.0
Upvotes: 1