pymat
pymat

Reputation: 1182

Querying data in Python based on the business day (8am-10pm) and weekdays only

My question is very much open ended and am looking for some advice on how to manipulate an array with the following format:

list:
[datetime.datetime(2016, 5, 17, 11, 32, 25),
datetime.datetime(2016, 5, 17, 11, 33, 25),
datetime.datetime(2016, 5, 17, 11, 34, 25), ...

I'm looking to manipulate the array so that only (1) business days are considered, then (2) only times between 8am and 10pm are considered. Do I use numpy or can pandas achieve this? Can the format of the list i.e. datetime.datetime be used or is this to be converted into another format?

I have tried splitting the epoch so that the date and time are separate. In doing so the following command:

myArray.ix[pd.to_datetime(myArray['Epoch']).isin(pd.bdate_range(start='2016-05-16', end='2016-06-15')

The Epoch format was date and time, so I just split this so that the myArray consists of now 'Date', 'Epoch', 'Value' as three separate columns. In order to better understand this command (which so far doesn't work) and how to use it, I've used the following debugging lines:

t = pd.bdate_range(start='2016-05-16', end='2016-06-15')
u = myArray['Date']
v = u.isin(t)

The problem is from what I see, the dates in myArray['Date'] with format yyyy-mm-yy are not matching with the dates shown in t, also with the format yyyy-mm-dd. This can be verified when looking at v, whereby the contents of this are all false.

Upvotes: 2

Views: 83

Answers (1)

MaxU - stand with Ukraine
MaxU - stand with Ukraine

Reputation: 210842

you can use bdate_range() for selecting only business days (1):

In [96]: d.ix[pd.to_datetime(d.Date.dt.date).isin(pd.bdate_range(start='2016-05-16', end='2016-05-22'))]
Out[96]:
                 Date       val
0 2016-05-17 11:32:25  0.235264
1 2016-05-17 11:33:25  0.755800
2 2016-05-17 11:34:25  0.849591
3 2016-05-20 12:00:25  0.955704

and between_time() for selecting times between ... (2):

In [97]: d.set_index('Date').between_time('11:30','11:34')
Out[97]:
                          val
Date
2016-05-17 11:32:25  0.235264
2016-05-17 11:33:25  0.755800

preserving original index:

In [99]: d.set_index('Date').between_time('11:30','11:34').reset_index()
Out[99]:
                 Date       val
0 2016-05-17 11:32:25  0.235264
1 2016-05-17 11:33:25  0.755800

Sample Data Frame:

In [98]: d
Out[98]:
                 Date       val
0 2016-05-17 11:32:25  0.235264
1 2016-05-17 11:33:25  0.755800
2 2016-05-17 11:34:25  0.849591
3 2016-05-20 12:00:25  0.955704
4 2016-05-21 13:13:13  0.301753

Upvotes: 3

Related Questions