Reputation: 13800
I'm confused by the interoperation between numpy and pandas date objects (or maybe just by numpy's datetime64 in general).
I was trying to count business days using numpy's built-in functionality like so:
np.busday_count("2016-03-01", "2016-03-31", holidays=[np.datetime64("28/03/2016")])
However, numpy apparently can't deal with the inverted date format:
ValueError: Error parsing datetime string "28/03/2016" at position 2
To get around this, I thought I'd just use pandas to_datetime, which can. However:
np.busday_count("2016-03-01", "2016-03-31", holidays=[np.datetime64(pd.to_datetime("28/03/2016"))])
ValueError: Cannot safely convert provided holidays input into an array of dates
Searching around for a bit, it seemed that this was caused by the fact that the chaining of to_datetime and np.datetime64 results in a datetime64[us]
object, which apparently the busday_count
function cannot accept (is this intended behaviour or a bug?). Thus, my next attempt was:
np.busday_count("2016-03-01", "2016-03-31", holidays=[np.datetime64(pd.Timestamp("28"), "D")])
But:
TypeError: Cannot cast datetime.datetime object from metadata [us] to [D] according to the rule 'same_kind'
And that's me out - why are there so many incompatibilities between all these datetime formats? And how can I get around them?
Upvotes: 6
Views: 7578
Reputation: 332
I've been having a similar issue, using np.is_busday()
The type of datetime64 is vital to get right. Checking the numpy datetime docs, you can specify the numpy datetime type to be D.
This works:
my_holidays=np.array([datetime.datetime.strptime(x,'%m/%d/%y') for x in holidays.Date.values], dtype='datetime64[D]')
day_flags['business_day'] = np.is_busday(days,holidays=my_holidays)
Whereas this throws the same error you got:
my_holidays=np.array([datetime.datetime.strptime(x,'%m/%d/%y') for x in holidays.Date.values], dtype='datetime64')
The only difference is specifying the type of datetime64.
dtype='datetime64[D]'
vs
dtype='datetime64'
Docs are here:
https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.datetime.html
Upvotes: 4
Reputation: 8521
I had the same issue while using np.busday_count
, later I figured out the problem was with the hours
, minutes
, seconds
, and milliseconds
getting added while converting it to datetime
object or numpy datetime
object.
I just converted to datetime
object with only date and not the hours
, minutes
, seconds
, and milliseconds
.
The following was my code:
holidays_list.json
file:
{
"holidays_2019": [
"04-Mar-2019",
"21-Mar-2019",
"17-Apr-2019",
"19-Apr-2019",
"29-Apr-2019",
"01-May-2019",
"05-Jun-2019",
"12-Aug-2019",
"15-Aug-2019",
"02-Sep-2019",
"10-Sep-2019",
"02-Oct-2019",
"08-Oct-2019",
"28-Oct-2019",
"12-Nov-2019",
"25-Dec-2019"
],
"format": "%d-%b-%Y"
}
code file
:
import json
import datetime
import numpy as np
with open('holidays_list.json', 'r') as infile:
data = json.loads(infile.read())
# the following is where I convert the datetime object to date
holidays = list(map(lambda x: datetime.datetime.strptime(
x, data['format']).date(), data['holidays_2019']))
start_date = datetime.datetime.today().date()
end_date = start_date + datetime.timedelta(days=30)
holidays = [start_date + datetime.timedelta(days=1)]
print(np.busday_count(start_date, end_date, holidays=holidays))
Upvotes: 0