Reputation: 3671
I have two columns and one is time and the other is volume. I need to subtract the volume from each point with the volume from exactly 24hrs ago.
What is the best way to go about doing this in pandas?
What if there are holes in the data and there is no point in the data frame from exactly 24hrs ago, can I set a rule so that it will index the nearest point to 24hrs ago?
sample df:
In [1]: df
Out[1]:
Volume Time
10 24/12/2017 18:40
27 24/12/2017 18:41
-19 24/12/2017 18:42
7 24/12/2017 18:43
-23 24/12/2017 18:44
18 24/12/2017 18:45
4 24/12/2017 18:46
Upvotes: 1
Views: 69
Reputation: 7994
You can actually create another data frame where you shift the time one day ahead.
Suppose data is
Volume Time
0 10 2017-12-24 18:40:00
1 27 2017-12-24 18:41:00
2 -19 2017-12-24 18:42:00
3 7 2017-12-24 18:43:00
4 -23 2017-12-24 18:44:00
5 18 2017-12-24 18:45:00
6 4 2017-12-24 18:46:00
7 127 2017-12-23 18:41:00
8 39 2017-12-23 18:42:00
9 8 2017-12-23 18:45:00
10 40 2017-12-23 18:46:00
df2 = df.copy()
df2["Time"] = df['Time'] - pd.DateOffset(1)
Then, you do inner merge.
df2.merge(df, on="Time")
Volume_x Time Volume_y
0 27 2017-12-23 18:41:00 127
1 -19 2017-12-23 18:42:00 39
2 18 2017-12-23 18:45:00 8
3 4 2017-12-23 18:46:00 40
Volumne_x
is actually the volume one day after the Time
column. That is if Time = 2017-12-23 18:41:00
, Volume_x
is actually the volumn at Time = 2017-12-24 18:41:00
. Now, you can do the subtraction.
For people who might want to experiment on the data.
{'Time': {0: Timestamp('2017-12-24 18:40:00'),
1: Timestamp('2017-12-24 18:41:00'),
2: Timestamp('2017-12-24 18:42:00'),
3: Timestamp('2017-12-24 18:43:00'),
4: Timestamp('2017-12-24 18:44:00'),
5: Timestamp('2017-12-24 18:45:00'),
6: Timestamp('2017-12-24 18:46:00'),
7: Timestamp('2017-12-23 18:41:00'),
8: Timestamp('2017-12-23 18:42:00'),
9: Timestamp('2017-12-23 18:45:00'),
10: Timestamp('2017-12-23 18:46:00')},
'Volume': {0: 10,
1: 27,
2: -19,
3: 7,
4: -23,
5: 18,
6: 4,
7: 127,
8: 39,
9: 8,
10: 40}}
Upvotes: 1