Reputation: 2257
I am looking for a way to split a list of datetime instances by hour. For example:
list_of_dts = [
datetime.datetime(2012,1,1,0,0,0),
datetime.datetime(2012,1,1,0,1,0),
datetime.datetime(2012,1,1,1,8,0),
datetime.datetime(2012,1,2,0,5,0),
datetime.datetime(2012,1,2,1,4,0),
]
would generate
[[datetime.datetime(2012,1,1,0,0,0), datetime.datetime(2012,1,1,0,1,0),
datetime.datetime(2012,1,2,0,5,0)],
[datetime.datetime(2012,1,1,1,8,0), datetime.datetime(2012,1,2,1,4,0)]]
I know you can split datetimes by day by using toordinal() as the grouping key, but I couldn't find an equivalent function that maps a datetime to its hour:
[list(group) for k, group in itertools.groupby(list_of_dts,
key=datetime.datetime.toordinal)]
Upvotes: 0
Views: 1617
Reputation: 1121894
Just extract the aspect you want to group on; if you want to group by the hour, then there is an attribute for that:
from itertools import groupby
[list(g) for k, g in groupby(list_of_dts, key=lambda d: d.hour)]
or using operator.attrgetter() instead of a lambda:
from itertools import groupby
from operator import attrgetter
[list(g) for k, g in groupby(list_of_dts, key=attrgetter('hour'))]
Do take into account that groupby() doesn't sort; it only produces groups of consecutive values that share the same grouping key.
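If you still want to use groupby() on unsorted input, one option is to sort by the same key first. The following is a minimal sketch using the question's sample data; the name by_hour is only illustrative. Because sorted() is stable, datetimes within the same hour keep their original relative order.

```python
import datetime
from itertools import groupby
from operator import attrgetter

list_of_dts = [
    datetime.datetime(2012, 1, 1, 0, 0, 0),
    datetime.datetime(2012, 1, 1, 0, 1, 0),
    datetime.datetime(2012, 1, 1, 1, 8, 0),
    datetime.datetime(2012, 1, 2, 0, 5, 0),
    datetime.datetime(2012, 1, 2, 1, 4, 0),
]

# Illustrative helper name: the same key is used for sorting and grouping.
by_hour = attrgetter('hour')

# Sort by hour first so that equal hours become consecutive, then group.
result = [list(g) for _, g in groupby(sorted(list_of_dts, key=by_hour), key=by_hour)]
```

Note that sorting costs O(n log n), which is one reason the dictionary approach may be preferable for large inputs.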
If you need to group unsorted values, then you are better off with grouping in a dictionary:
grouped = {}
for dt in list_of_dts:
    grouped.setdefault(dt.hour, []).append(dt)
result = list(grouped.values())
or, sorting the output by hour:
result = [grouped[hour] for hour in sorted(grouped)]
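The dictionary approach can also be wrapped in a small function. This is only a sketch: group_by_hour is a hypothetical name, and collections.defaultdict is swapped in for setdefault() as a stylistic alternative with the same behaviour.

```python
import datetime
from collections import defaultdict


def group_by_hour(dts):
    """Group datetimes by their hour attribute, returning groups in hour order."""
    grouped = defaultdict(list)
    for dt in dts:
        # Missing keys are created as empty lists automatically.
        grouped[dt.hour].append(dt)
    return [grouped[hour] for hour in sorted(grouped)]


sample = [
    datetime.datetime(2012, 1, 1, 1, 8, 0),
    datetime.datetime(2012, 1, 1, 0, 0, 0),
    datetime.datetime(2012, 1, 2, 0, 5, 0),
]
groups = group_by_hour(sample)
```

Here the input is deliberately unsorted; the two datetimes with hour 0 still end up in one group, in their original relative order.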
Demo:
>>> import datetime
>>> from itertools import groupby
>>> from operator import attrgetter
>>> list_of_dts = [
... datetime.datetime(2012,1,1,0,0,0),
... datetime.datetime(2012,1,1,0,1,0),
... datetime.datetime(2012,1,1,1,8,0),
... datetime.datetime(2012,1,2,0,5,0),
... datetime.datetime(2012,1,2,1,4,0),
... ]
>>> [list(g) for k, g in groupby(list_of_dts, key=attrgetter('hour'))]
[[datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 1, 0, 1)], [datetime.datetime(2012, 1, 1, 1, 8)], [datetime.datetime(2012, 1, 2, 0, 5)], [datetime.datetime(2012, 1, 2, 1, 4)]]
>>> grouped = {}
>>> for dt in list_of_dts:
...     grouped.setdefault(dt.hour, []).append(dt)
...
>>> [grouped[hour] for hour in sorted(grouped)]
[[datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 1, 0, 1), datetime.datetime(2012, 1, 2, 0, 5)], [datetime.datetime(2012, 1, 1, 1, 8), datetime.datetime(2012, 1, 2, 1, 4)]]
>>> from pprint import pprint
>>> pprint(_)
[[datetime.datetime(2012, 1, 1, 0, 0),
  datetime.datetime(2012, 1, 1, 0, 1),
  datetime.datetime(2012, 1, 2, 0, 5)],
 [datetime.datetime(2012, 1, 1, 1, 8), datetime.datetime(2012, 1, 2, 1, 4)]]
Upvotes: 2