Mustard Tiger
Mustard Tiger

Reputation: 3671

Index pandas dataframe by time difference

I have two columns and one is time and the other is volume. I need to subtract the volume from each point with the volume from exactly 24hrs ago.

What is the best way to go about doing this in pandas?

What if there are holes in the data and there is no point in the data frame from exactly 24hrs ago, can I set a rule so that it will index the nearest point to 24hrs ago?

sample df:

In [1]: df
Out[1]: 
Volume   Time

10       24/12/2017 18:40 
27       24/12/2017 18:41
-19      24/12/2017 18:42
7        24/12/2017 18:43
-23      24/12/2017 18:44
18       24/12/2017 18:45
4        24/12/2017 18:46

Upvotes: 1

Views: 69

Answers (1)

Tai
Tai

Reputation: 7994

You can actually create another data frame where you shift the time one day ahead.

Suppose data is

    Volume  Time
0   10      2017-12-24 18:40:00
1   27      2017-12-24 18:41:00
2   -19     2017-12-24 18:42:00
3   7       2017-12-24 18:43:00
4   -23     2017-12-24 18:44:00
5   18      2017-12-24 18:45:00
6   4       2017-12-24 18:46:00
7   127     2017-12-23 18:41:00
8   39      2017-12-23 18:42:00
9   8       2017-12-23 18:45:00
10  40      2017-12-23 18:46:00

df2 = df.copy()
df2["Time"] = df['Time'] - pd.DateOffset(1)

Then, you do inner merge.

df2.merge(df, on="Time")

    Volume_x    Time                    Volume_y
0   27          2017-12-23 18:41:00     127
1   -19         2017-12-23 18:42:00     39
2   18          2017-12-23 18:45:00     8
3   4           2017-12-23 18:46:00     40

Volumne_x is actually the volume one day after the Time column. That is if Time = 2017-12-23 18:41:00, Volume_x is actually the volumn at Time = 2017-12-24 18:41:00. Now, you can do the subtraction.


For people who might want to experiment on the data.

{'Time': {0: Timestamp('2017-12-24 18:40:00'),
  1: Timestamp('2017-12-24 18:41:00'),
  2: Timestamp('2017-12-24 18:42:00'),
  3: Timestamp('2017-12-24 18:43:00'),
  4: Timestamp('2017-12-24 18:44:00'),
  5: Timestamp('2017-12-24 18:45:00'),
  6: Timestamp('2017-12-24 18:46:00'),
  7: Timestamp('2017-12-23 18:41:00'),
  8: Timestamp('2017-12-23 18:42:00'),
  9: Timestamp('2017-12-23 18:45:00'),
  10: Timestamp('2017-12-23 18:46:00')},
 'Volume': {0: 10,
  1: 27,
  2: -19,
  3: 7,
  4: -23,
  5: 18,
  6: 4,
  7: 127,
  8: 39,
  9: 8,
  10: 40}}

Upvotes: 1

Related Questions