Reputation: 4718
I have data which is of the form Thu Jun 22 09:43:06
and I would like to infer the year from this to use datetime
to calculate the time between two dates. Is there a way to use datetime
to infer the year given the above data?
Upvotes: 3
Views: 311
Reputation: 2690
You could also use pd.date_range
to generate a lookup table
calendar = pd.date_range('2017-01-01', '2020-12-31')
dow = {i: d for i, d in enumerate(('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'))}
moy = {i: d for i, d in enumerate(('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'), 1)}
lup = {'{} {} {:>2d}'.format(dow[d.weekday()], moy[d.month], d.day): str(d.year) for d in calendar}
date = 'Tue Jun 25'
print(lup[date])
# 2019
print(pd.Timestamp(date + ' ' + lup[date]))
# 2019-06-25 00:00:00
Benchmarking it in ipython, there's some decent speedup once the table is generated, but the overhead of generating the table may not be worth it unless you have a lot of dates to confirm.
In [28]: lup = gen_lookup('1-1-2010', '12-31-2017')
In [29]: date = 'Thu Jun 22'
In [30]: lup[date]
Out[30]: ['2017']
In [32]: list(find_year(2010, 2017, 6, 22, 3))
Out[32]: [2017]
In [33]: %timeit lup = gen_lookup('1-1-2010', '12-31-2017')
13.8 ms ± 136 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [34]: %timeit yr = lup[date]
54.1 ns ± 0.547 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [35]: %timeit yr = find_year(2010, 2017, 6, 22, 3)
248 ns ± 3.61 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Upvotes: 1
Reputation: 12927
No, but if you know the range (for example 2010..2017), you can just iterate over years to see if Jun 22 falls on Thursday:
def find_year(start_year, end_year, month, day, week_day):
for y in range(start_year, end_year+1):
if datetime.datetime(y, month, day, 0, 0).weekday() == week_day:
yield y
# weekday is 0..6 starting from Monday, so 3 stands for Thursday
print(list(find_year(2010, 2017, 6, 22, 3)))
[2017]
For longer ranges, though, there might be more than one result:
print(list(find_year(2000,2017, 6, 22, 3)))
[2000, 2006, 2017]
Upvotes: 3