drjrm3

Reputation: 4718

Infer year from day of week and date with python datetime

I have timestamps of the form Thu Jun 22 09:43:06, with no year. I would like to infer the year so that I can use datetime to calculate the time between two such dates. Is there a way to have datetime infer the year from the day of the week and the date?

Upvotes: 3

Views: 311

Answers (2)

Elliot

Reputation: 2690

You could also use pd.date_range to generate a lookup table:

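import pandas as pd

# Every calendar day in the range of years the data could come from
calendar = pd.date_range('2017-01-01', '2020-12-31')
# weekday() numbers days 0..6 starting from Monday; months are numbered 1..12
dow = dict(enumerate(('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun')))
moy = dict(enumerate(('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'), 1))
# Keys look like 'Tue Jun 25' (day right-aligned to width 2).
# Over longer ranges a key can match several years; only the last one is kept.
lup = {'{} {} {:>2d}'.format(dow[d.weekday()], moy[d.month], d.day): str(d.year) for d in calendar}
date = 'Tue Jun 25'
print(lup[date])
# 2019
print(pd.Timestamp(date + ' ' + lup[date]))
# 2019-06-25 00:00:00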
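
The benchmark that follows calls gen_lookup, which is not shown in this answer. A minimal sketch of what such a helper might look like, assuming it takes month-day-year strings and returns a list of matching years per key (as the Out[30] output suggests). This version is a guess, not the author's code, and it uses only the standard library instead of pd.date_range:

```python
import datetime
from collections import defaultdict

DOW = ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun')
MOY = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
       'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')


def gen_lookup(start, end):
    """Hypothetical helper: map keys like 'Thu Jun 22' to a list of years.

    start and end are assumed to be 'month-day-year' strings such as
    '1-1-2010'; both endpoints are included.
    """
    first = datetime.datetime.strptime(start, '%m-%d-%Y').date()
    last = datetime.datetime.strptime(end, '%m-%d-%Y').date()
    lup = defaultdict(list)
    day = first
    while day <= last:
        # Same key layout as the dict comprehension above
        key = '{} {} {:>2d}'.format(DOW[day.weekday()], MOY[day.month - 1], day.day)
        lup[key].append(str(day.year))
        day += datetime.timedelta(days=1)
    return dict(lup)


lup = gen_lookup('1-1-2010', '12-31-2017')
print(lup['Thu Jun 22'])
# ['2017']
```

Storing a list per key keeps every candidate year, so ambiguous dates in a long range are visible instead of being silently overwritten.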

Benchmarking it in IPython, each lookup is noticeably faster once the table exists, but generating the table may not be worth the overhead unless you have a lot of dates to resolve.

In [28]: lup = gen_lookup('1-1-2010', '12-31-2017')

In [29]: date = 'Thu Jun 22'

In [30]: lup[date]
Out[30]: ['2017']

In [32]: list(find_year(2010, 2017, 6, 22, 3))
Out[32]: [2017]

In [33]: %timeit lup = gen_lookup('1-1-2010', '12-31-2017')
13.8 ms ± 136 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [34]: %timeit yr = lup[date]
54.1 ns ± 0.547 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [35]: %timeit yr = find_year(2010, 2017, 6, 22, 3)
248 ns ± 3.61 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
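
One caveat when reading these numbers: find_year, as defined in the other answer, is a generator function, so calling it without iterating only creates a generator object and does not run the loop over the years. A fairer comparison would probably wrap the call in list(). A minimal sketch using the standard timeit module (the repetition count is an arbitrary choice, and no timings are claimed here):

```python
import datetime
import timeit


def find_year(start_year, end_year, month, day, week_day):
    # Same generator as in the other answer
    for y in range(start_year, end_year + 1):
        if datetime.datetime(y, month, day, 0, 0).weekday() == week_day:
            yield y


N = 10000  # arbitrary number of repetitions

# Only builds the generator object; the loop body never runs
lazy = timeit.timeit(lambda: find_year(2010, 2017, 6, 22, 3), number=N)
# list() consumes the generator, so every year in the range is checked
full = timeit.timeit(lambda: list(find_year(2010, 2017, 6, 22, 3)), number=N)

print('generator creation: {:.0f} ns per call'.format(lazy / N * 1e9))
print('full search:        {:.0f} ns per call'.format(full / N * 1e9))
```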

Upvotes: 1

Błotosmętek

Reputation: 12927

No, but if you know the range (for example 2010..2017), you can just iterate over the years and check in which of them Jun 22 falls on a Thursday:

import datetime

def find_year(start_year, end_year, month, day, week_day):
    # Yield every year in start_year..end_year (inclusive) in which
    # month/day falls on the given weekday
    for y in range(start_year, end_year+1):
        if datetime.datetime(y, month, day, 0, 0).weekday() == week_day:
            yield y
# weekday is 0..6 starting from Monday, so 3 stands for Thursday
print(list(find_year(2010, 2017, 6, 22, 3)))

[2017]

For longer ranges, though, there might be more than one result:

print(list(find_year(2000, 2017, 6, 22, 3)))

[2000, 2006, 2017]
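
To tie this back to the question, here is a minimal sketch of going from the raw string to a full datetime and then to a time difference. It assumes English day and month abbreviations (strptime's %a and %b depend on the locale) and a year range in which exactly one year matches. parse_with_year is a hypothetical helper name, and this find_year variant additionally skips years where the date does not exist (Feb 29):

```python
import datetime

DOW = ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun')


def find_year(start_year, end_year, month, day, week_day):
    for y in range(start_year, end_year + 1):
        try:
            candidate = datetime.date(y, month, day)
        except ValueError:
            # Feb 29 does not exist in non-leap years
            continue
        if candidate.weekday() == week_day:
            yield y


def parse_with_year(text, start_year, end_year):
    """Hypothetical helper: parse 'Thu Jun 22 09:43:06' and fill in the year."""
    # Append a leap-year placeholder so that 'Feb 29' parses too
    stamp = datetime.datetime.strptime(text + ' 2000', '%a %b %d %H:%M:%S %Y')
    # strptime does not check the weekday against the date, so read it here
    week_day = DOW.index(text.split()[0])
    years = list(find_year(start_year, end_year, stamp.month, stamp.day, week_day))
    if len(years) != 1:
        raise ValueError('expected exactly one matching year, got {}'.format(years))
    return stamp.replace(year=years[0])


a = parse_with_year('Thu Jun 22 09:43:06', 2010, 2017)
b = parse_with_year('Fri Jun 23 10:00:00', 2010, 2017)
print(a)
# 2017-06-22 09:43:06
print(b - a)
# 1 day, 0:16:54
```

Raising on zero or several matches is a deliberate choice in this sketch: with a wider range such as 2000..2017 the same input is ambiguous, and guessing silently would give wrong time differences.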

Upvotes: 3
