user3009782

Reputation: 41

Python pandas: selecting rows based on age

I'm trying to figure out how to do this in pandas and have had no luck so far.

My data frame looks like this:

                        A           B        C          D
time                
2013-07-19 14:54:03     1354.85     92.20   1453.44     7746.56
2013-07-19 14:56:02     1348.30     44.60   1399.83     7800.17
2013-07-19 14:58:02     1285.76     33.93   1325.31     7874.69
...
2013-12-16 14:24:02     1114.74     136.59  1254.04     7945.96
2013-12-16 14:26:03     1180.76     65.39   1248.59     7951.41
2013-12-16 14:28:03     1015.98     126.96  1147.68     8052.32

This data gets updated very frequently and I would like to be able to select all values in the last 24 hours, or last week, or last month, etc.

My current workaround is to pull the data from a database using a query such as this:

query_24_hours = ('select time, A, B, C, D from \
     agg where time >= datetime(\'now\', \'-24 Hours\', \'localtime\')')

Thanks.

Upvotes: 4

Views: 322

Answers (1)

TomAugspurger

Reputation: 28946

Assuming that you don't have to deal with timezones:

import datetime

now = datetime.datetime.now()
yesterday = now + datetime.timedelta(days=-1)
fmt = '%Y-%m-%d'  # use fmt = '%Y-%m-%d %H:%M:%S' if you want more precision

# .loc replaces .ix, which was removed in pandas 1.0
df.loc[yesterday.strftime(fmt):now.strftime(fmt)]

With your example:

In [17]: now = datetime.datetime(2013, 7, 20)  # since that's when the data is from

In [18]: yesterday = now + datetime.timedelta(days=-1)

In [19]: df.loc[yesterday.strftime(fmt):now.strftime(fmt)]
Out[19]: 
                         B        C        D
time                                        
2013-07-19 14:54:03  92.20  1453.44  7746.56
2013-07-19 14:56:02  44.60  1399.83  7800.17
2013-07-19 14:58:02  33.93  1325.31  7874.69

[3 rows x 3 columns]
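
Note that slicing with day-precision strings returns every row from both calendar days, not an exact 24-hour window. If the exact window matters, one option is to compare the index against a cutoff timestamp instead. A minimal sketch, assuming the index is a DatetimeIndex; the sample rows and the helper name select_recent are made up for illustration and are not part of pandas:

```python
import pandas as pd

# Hypothetical sample data shaped like the frame in the question.
index = pd.to_datetime([
    "2013-11-01 10:00:00",
    "2013-12-01 10:00:00",
    "2013-12-12 10:00:00",
    "2013-12-16 14:24:02",
    "2013-12-16 14:26:03",
    "2013-12-16 14:28:03",
])
df = pd.DataFrame(
    {"A": [1300.00, 1250.00, 1200.00, 1114.74, 1180.76, 1015.98]},
    index=index,
)
df.index.name = "time"


def select_recent(frame, window, now=None):
    """Return the rows whose timestamp lies in the interval (now - window, now]."""
    if now is None:
        now = pd.Timestamp.now()
    cutoff = now - window
    # A boolean mask works even if the index is not sorted.
    return frame.loc[(frame.index > cutoff) & (frame.index <= now)]


# Fixed "now" so the example is reproducible; drop it to use the current time.
now = pd.Timestamp("2013-12-16 14:30:00")

last_day = select_recent(df, pd.Timedelta(hours=24), now=now)
last_week = select_recent(df, pd.Timedelta(days=7), now=now)
last_month = select_recent(df, pd.DateOffset(months=1), now=now)
```

pd.Timedelta covers fixed-length windows such as hours, days, and weeks, while pd.DateOffset(months=1) handles calendar months, whose length varies.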

Also have a look at the arrow library to replace the datetime part. It's fantastic for these sorts of things.

Upvotes: 2
