Kevin

Reputation: 2257

Split a list of datetimes by hour using Python

I am looking for a way to split a list of datetime instances by hour. For example:

list_of_dts = [
    datetime.datetime(2012,1,1,0,0,0),
    datetime.datetime(2012,1,1,0,1,0),
    datetime.datetime(2012,1,1,1,8,0),
    datetime.datetime(2012,1,2,0,5,0),
    datetime.datetime(2012,1,2,1,4,0),
]

would generate

[[datetime.datetime(2012,1,1,0,0,0), datetime.datetime(2012,1,1,0,1,0), 
  datetime.datetime(2012,1,2,0,5,0)], 
 [datetime.datetime(2012,1,1,1,8,0), datetime.datetime(2012,1,2,1,4,0)]]

I know you can split datetimes by day by mapping each one to its ordinal with toordinal(), but I couldn't find a function that does the same for hours:

[list(group) for k, group in itertools.groupby(list_of_dts,
                                               key=datetime.datetime.toordinal)]

Upvotes: 0

Views: 1617

Answers (1)

Martijn Pieters

Reputation: 1121894

Just extract the aspect you want to group on; if you want to group by the hour, then there is an attribute to extract:

from itertools import groupby

[list(g) for k, g in groupby(list_of_dts, key=lambda d: d.hour)]

or using operator.attrgetter() instead of a lambda:

from itertools import groupby
from operator import attrgetter

[list(g) for k, g in groupby(list_of_dts, key=attrgetter('hour'))]

Do take into account that groupby() doesn't sort; it'll produce groups of consecutive values with the same grouping key only.
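If you still want to use groupby(), one option is to sort the input by the same key first. The following is a minimal sketch of that idea, not part of the original answer; the names by_hour and sorted_dts are made up for illustration, and list_of_dts is the sample list from the question:

```python
import datetime
from itertools import groupby
from operator import attrgetter

list_of_dts = [
    datetime.datetime(2012, 1, 1, 0, 0, 0),
    datetime.datetime(2012, 1, 1, 0, 1, 0),
    datetime.datetime(2012, 1, 1, 1, 8, 0),
    datetime.datetime(2012, 1, 2, 0, 5, 0),
    datetime.datetime(2012, 1, 2, 1, 4, 0),
]

# Hypothetical helper name: the key used for both sorting and grouping.
by_hour = attrgetter('hour')

# sorted() is stable, so datetimes sharing an hour keep their input order.
sorted_dts = sorted(list_of_dts, key=by_hour)

# After sorting, equal hours are adjacent, so groupby() sees one run per hour.
result = [list(g) for _, g in groupby(sorted_dts, key=by_hour)]
```

Sorting costs O(n log n), so for large unsorted inputs a single pass that collects values in a dictionary is usually the cheaper choice.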

If you need to group unsorted values, then you are better off with grouping in a dictionary:

grouped = {}

for dt in list_of_dts:
    grouped.setdefault(dt.hour, []).append(dt)

# list() makes a real list on Python 3, where values() returns a view
result = list(grouped.values())

or, sorting the output by hour:

result = [grouped[hour] for hour in sorted(grouped)]
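The same dictionary grouping can also be written with collections.defaultdict, which creates the empty list for a new hour automatically. This is a sketch of an equivalent variant, not something the original answer used; list_of_dts is again the sample list from the question:

```python
import datetime
from collections import defaultdict

list_of_dts = [
    datetime.datetime(2012, 1, 1, 0, 0, 0),
    datetime.datetime(2012, 1, 1, 0, 1, 0),
    datetime.datetime(2012, 1, 1, 1, 8, 0),
    datetime.datetime(2012, 1, 2, 0, 5, 0),
    datetime.datetime(2012, 1, 2, 1, 4, 0),
]

# Missing keys start out as an empty list, so no setdefault() call is needed.
grouped = defaultdict(list)

for dt in list_of_dts:
    grouped[dt.hour].append(dt)

# Iterating a dict yields its keys; sorting them orders the groups by hour.
result = [grouped[hour] for hour in sorted(grouped)]
```

The behaviour matches the setdefault() version; which one reads better is mostly a matter of taste.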

Demo:

>>> import datetime
>>> from itertools import groupby
>>> from operator import attrgetter
>>> list_of_dts = [
...     datetime.datetime(2012,1,1,0,0,0),
...     datetime.datetime(2012,1,1,0,1,0),
...     datetime.datetime(2012,1,1,1,8,0),
...     datetime.datetime(2012,1,2,0,5,0),
...     datetime.datetime(2012,1,2,1,4,0),
... ]
>>> [list(g) for k, g in groupby(list_of_dts, key=attrgetter('hour'))]
[[datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 1, 0, 1)], [datetime.datetime(2012, 1, 1, 1, 8)], [datetime.datetime(2012, 1, 2, 0, 5)], [datetime.datetime(2012, 1, 2, 1, 4)]]
>>> grouped = {}
>>> for dt in list_of_dts:
...     grouped.setdefault(dt.hour, []).append(dt)
... 
>>> [grouped[hour] for hour in sorted(grouped)]
[[datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 1, 0, 1), datetime.datetime(2012, 1, 2, 0, 5)], [datetime.datetime(2012, 1, 1, 1, 8), datetime.datetime(2012, 1, 2, 1, 4)]]
>>> from pprint import pprint
>>> pprint(_)
[[datetime.datetime(2012, 1, 1, 0, 0),
  datetime.datetime(2012, 1, 1, 0, 1),
  datetime.datetime(2012, 1, 2, 0, 5)],
 [datetime.datetime(2012, 1, 1, 1, 8), datetime.datetime(2012, 1, 2, 1, 4)]]

Upvotes: 2
