Grouping value by time frames

Question

Here is what I am using to group items by time frame when parsing a CSV file. It works fine, but this code only groups by the hour. I would now like to create 4-hour slices (or 30-minute slices).

tf = "%d-%b-%Y-%H"

lmb = lambda d: datetime.datetime.strptime(d["Date[G]"]+"-"+d["Time[G]"], "%d-%b-%Y-%H:%M:%S.%f").strftime(tf)

for k, g in itertools.groupby(csvReader, key = lmb):
    for i in g:
        "do something"

Thanks!

ecatmur · Accepted Answer

In general, the best approach is to have the groupby key function return a tuple that identifies the bucket each item belongs to.

For example, for 4h slices:

def by_4h(d):
    dt = datetime.datetime.strptime(d["Date[G]"]+"-"+d["Time[G]"], "%d-%b-%Y-%H:%M:%S.%f")
    return (dt.year, dt.month, dt.day, dt.hour // 4)

Any two times in the same 4-hour slice (counted from midnight) give the same value of hour // 4, so the tuple can end there.
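To see how the buckets fall, here is a minimal sketch that runs the same key over a few made-up rows (the sample dates and times are assumptions for illustration, not data from the question). Note that itertools.groupby only merges consecutive items with equal keys, so the sketch assumes the rows are already in chronological order.

```python
import datetime
import itertools

FMT = "%d-%b-%Y-%H:%M:%S.%f"

def by_4h(d):
    dt = datetime.datetime.strptime(d["Date[G]"] + "-" + d["Time[G]"], FMT)
    return (dt.year, dt.month, dt.day, dt.hour // 4)

# Made-up rows standing in for csvReader, already sorted by time
rows = [
    {"Date[G]": "01-Jan-2013", "Time[G]": "00:15:00.000"},
    {"Date[G]": "01-Jan-2013", "Time[G]": "03:59:59.999"},
    {"Date[G]": "01-Jan-2013", "Time[G]": "04:00:00.000"},
    {"Date[G]": "01-Jan-2013", "Time[G]": "23:30:00.000"},
]

# Collect each bucket key together with the number of rows in that bucket
groups = [(key, len(list(group)))
          for key, group in itertools.groupby(rows, key=by_4h)]

for key, count in groups:
    print(key, count)
```

The first two rows share bucket 0 (00:00 to 03:59), the third lands in bucket 1, and the last in bucket 5.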

Or for 30m slices:

def by_30m(d):
    dt = datetime.datetime.strptime(d["Date[G]"]+"-"+d["Time[G]"], "%d-%b-%Y-%H:%M:%S.%f")
    return (dt.year, dt.month, dt.day, dt.hour, dt.minute // 30)

This uses // (floor division) for Python 3 compatibility, where / on two integers returns a float and would put every hour in its own group. // also works in Python 2.x and makes it clear that integer division is intended.
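If you need several slice widths, the two functions above can be folded into one. The following is only a sketch: make_key is a hypothetical helper name, not part of the original answer, and it counts minutes from midnight rather than splitting hour and minute, so the tuples look different from by_30m but group the same rows together.

```python
import datetime

FMT = "%d-%b-%Y-%H:%M:%S.%f"

def make_key(minutes):
    """Build a groupby key for slices `minutes` wide, counted from midnight.

    Hypothetical helper; slices are equal only if `minutes` divides 1440.
    """
    def key(d):
        dt = datetime.datetime.strptime(d["Date[G]"] + "-" + d["Time[G]"], FMT)
        minute_of_day = dt.hour * 60 + dt.minute
        return (dt.year, dt.month, dt.day, minute_of_day // minutes)
    return key

by_4h = make_key(240)   # 4-hour slices
by_30m = make_key(30)   # 30-minute slices

row = {"Date[G]": "01-Jan-2013", "Time[G]": "05:40:00.000"}
print(by_4h(row))
print(by_30m(row))
```

Either key can be passed straight to itertools.groupby(csvReader, key=...), as in the question.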
