TheBeardedBerry

Reputation: 1764

Using python to create an average out of a list of times

I have a huge list of times (HH:MM:SS). I know that I could compute an average by separating the hours, minutes, and seconds, averaging each one, and concatenating them back together. However, I feel there must be a better way. Does anyone know of one?

Thanks!

Upvotes: 12

Views: 30999

Answers (7)

polimath

Reputation: 301

There's a problem with converting to seconds since midnight and averaging. If you do that with 23:50 and 00:10 you get 12:00 when what you want is 00:00.
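To make the wraparound problem concrete, here is a minimal sketch of the naive approach. The helper name `naive_mean_time` is made up for illustration, and it assumes the inputs are zero-padded `HH:MM` strings.

```python
def naive_mean_time(times):
    """Average 'HH:MM' strings as seconds since midnight, ignoring wraparound."""
    seconds = []
    for t in times:
        h, m = t.split(":")
        seconds.append(int(h) * 3600 + int(m) * 60)
    mean = sum(seconds) // len(seconds)
    return "%02d:%02d" % (mean // 3600, mean % 3600 // 60)

# 23:50 is 85800 s and 00:10 is 600 s, so the arithmetic mean is 43200 s
print(naive_mean_time(["23:50", "00:10"]))  # 12:00
```

The two times are only 20 minutes apart on the clock, but the arithmetic mean lands on the opposite side of the day.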

A better approach is to average the angles.

import datetime
import math
import numpy

def datetime_to_radians(x):
    # radians are calculated using a 24-hour circle, not 12-hour, starting at north and moving clockwise
    time_of_day = x.time()
    seconds_from_midnight = 3600 * time_of_day.hour + 60 * time_of_day.minute + time_of_day.second
    # one full turn (2*pi) corresponds to 24 hours
    radians = float(seconds_from_midnight) / float(24 * 60 * 60) * 2.0 * math.pi
    return radians

def average_angle(angles):
    # angles measured in radians
    x_sum = numpy.sum([math.sin(x) for x in angles])
    y_sum = numpy.sum([math.cos(x) for x in angles])
    x_mean = x_sum / float(len(angles))
    y_mean = y_sum / float(len(angles))
    return numpy.arctan2(x_mean, y_mean)

def radians_to_time_of_day(x):
    # radians are measured clockwise from north and represent time in a 24-hour circle
    # arctan2 returns values in [-pi, pi], so wrap negative angles into [0, 2*pi) first
    angle = float(x) % (2.0 * math.pi)
    # round instead of truncating so float error cannot drop a second; wrap 86400 back to 0
    seconds_from_midnight = int(round(angle / (2.0 * math.pi) * 24.0 * 60.0 * 60.0)) % (24 * 60 * 60)
    hour = seconds_from_midnight // 3600
    minute = (seconds_from_midnight % 3600) // 60
    second = seconds_from_midnight % 60
    return datetime.time(hour, minute, second)

def average_times_of_day(x):
    # input datetime.datetime array and output datetime.time value
    angles = [datetime_to_radians(y) for y in x]
    avg_angle = average_angle(angles)
    return radians_to_time_of_day(avg_angle)

average_times_of_day([datetime.datetime(2017, 6, 9, 0, 10), datetime.datetime(2017, 6, 9, 0, 20)])
# datetime.time(0, 15)

average_times_of_day([datetime.datetime(2017, 6, 9, 23, 50), datetime.datetime(2017, 6, 9, 0, 10)])
# datetime.time(0, 0)

Upvotes: 18

Juli

Reputation: 1051

There is an alternative to the great answers already contributed, but it is case-specific. For example, suppose you want to average the time of day people go to bed, which normally falls between 6 pm and 6 am. First convert hours and minutes to a decimal, so that 12:30 becomes 12.5. Then add 24 to the times that would throw off the average; in the sleep case those are the times between 0:00 and 6:00 am, which become 24.0 to 30.0. Now you can compute the average as you normally would. Finally, subtract 24 again if the average is 24 or higher, and you are done:

import numpy as np
import pandas as pd

def hourtoDec(data):
    '''
    Transforms the hour string values in the list data
    to decimal. The format assumed is HH:mm.
    Values are transformed to float.
    For example, for 5:30pm the equivalent is 17.5.
    This function preserves NaN values.
    '''
    dataOutput = []
    for i in data:
        if not pd.isnull(i):
            if isinstance(i, str):
                h, m = i.split(':')
                dataOutput.append(int(h) + int(m) / 60.0)
            elif isinstance(i, (np.floating, float)):
                # np.float was removed in NumPy 1.24; np.floating covers NumPy float scalars
                dataOutput.append(i)
        else:
            dataOutput.append(i)
    return dataOutput



timestr=pd.DataFrame([ "2020-04-26T23:00:30.000", 
                      "2020-04-25T22:00:30.000", 
                      "2020-04-24T01:00:30.000", 
                      "2020-04-23T02:00:30.000"],columns=["timestamp"])
hours=timestr['timestamp'].apply(lambda x: ":".join(x.split("T")[1].split(":")[0:2]))
hoursDec=hourtoDec(hours)

times2=[]
for i in hoursDec:
    if i>=0 and i<6:
        times2.append(i+24)
    else:
        times2.append(i)

average=np.mean(times2)
if average>=24:
    average=average-24
print(average)
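For reference, the same shift-by-24 idea can be sketched without pandas or NumPy. This is only an illustration under stated assumptions: the function name `mean_time_with_shift` and the `cutoff_hour` parameter are invented here, and the inputs are assumed to be `HH:MM` strings.

```python
def mean_time_with_shift(times, cutoff_hour=6):
    """Average 'HH:MM' strings, treating times before cutoff_hour as next-day."""
    decimals = []
    for t in times:
        h, m = t.split(":")
        value = int(h) + int(m) / 60.0
        if value < cutoff_hour:
            value += 24  # early-morning times count as "after midnight"
        decimals.append(value)
    mean = sum(decimals) / len(decimals)
    if mean >= 24:
        mean -= 24  # shift the average back into the 0-24 range
    total_minutes = int(round(mean * 60))
    return "%02d:%02d" % (total_minutes // 60 % 24, total_minutes % 60)

# 23, 22, 25 and 26 average to 24, which maps back to midnight
print(mean_time_with_shift(["23:00", "22:00", "01:00", "02:00"]))  # 00:00
```

The cutoff has to be chosen per dataset, which is why this approach is case-specific.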

Upvotes: 2

LaSul

Reputation: 2411

Convert each time to an angle on a 24-hour circle, represent each angle as a unit complex number, average those numbers, and take the argument (phase) of the mean to get the mean angle.

Finally, convert the mean angle back to seconds and format the result as HH:MM:SS.

from cmath import rect, phase
from math import radians, degrees

def meanAngle(deg):
    complexDegree = sum(rect(1, radians(d)) for d in deg) / len(deg)
    argument = phase(complexDegree)
    meanAngle = degrees(argument)
    return meanAngle

def meanTime(times):
    t = (time.split(':') for time in times)
    seconds = ((float(s) + int(m) * 60 + int(h) * 3600) 
               for h, m, s in t)
    day = 24 * 60 * 60
    toAngles = [s * 360. / day for s in seconds]
    meanAsAngle = meanAngle(toAngles)
    # round to whole seconds and wrap into [0, day); truncating instead can turn
    # a tiny negative float error into 23:59:59
    meanSeconds = round(meanAsAngle * day / 360.) % day
    h, m = divmod(meanSeconds, 3600)
    m, s = divmod(m, 60)
    return('%02i:%02i:%02i' % (h, m, s))

print(meanTime(["15:00:00", "21:00:00"]))
# 18:00:00
print(meanTime(["23:00:00", "01:00:00"]))
# 00:00:00

Upvotes: 2

tsundoku

Reputation: 1350

Here is one possible implementation of the answer by @eumiro, but this logic only works if these are durations, not times, as pointed out by @lazyr:

from datetime import timedelta

times = ['00:58:00','00:59:00','01:00:00','01:01:00','01:02:00']

print(str(timedelta(seconds=sum(map(lambda f: int(f[0])*3600 + int(f[1])*60 + int(f[2]), map(lambda f: f.split(':'), times)))/len(times))))

Also thanks to a post by @SilentGhost, and a post by @Herms

Upvotes: 8

zenpoy

Reputation: 20126

First parse each time string into a time struct using strptime, then convert it to seconds since the epoch using mktime. Add up all the seconds, divide by the number of times, and convert the result back to a time struct using localtime.

Here is an example:

import time


a = time.strptime("2000:11:12:13","%Y:%H:%M:%S")
b = time.strptime("2000:11:14:13","%Y:%H:%M:%S")

avg_time = time.localtime(((time.mktime(a)+time.mktime(b))/2))

>> time.struct_time(tm_year=2000, tm_mon=1, tm_mday=1, tm_hour=11, tm_min=13, tm_sec=13, tm_wday=5, tm_yday=1, tm_isdst=0)

Note that I added the year 2000 because mktime raises an OverflowError for the default year 1900.
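As a possible variation, the same parse, average, and convert-back steps can be sketched with the datetime module, which avoids mktime and so needs no year workaround. The function name `mean_clock_time` is hypothetical, and the sketch assumes Python 3 and `HH:MM:SS` input strings.

```python
from datetime import datetime, timedelta

def mean_clock_time(times, fmt="%H:%M:%S"):
    """Average clock-time strings by averaging their offsets from midnight."""
    # strptime fills in 1900-01-01 as the date, so this is midnight of that day
    midnight = datetime.strptime("00:00:00", "%H:%M:%S")
    offsets = [datetime.strptime(t, fmt) - midnight for t in times]
    mean_offset = sum(offsets, timedelta()) / len(offsets)
    return (midnight + mean_offset).time()

print(mean_clock_time(["11:12:13", "11:14:13"]))  # 11:13:13
```

Like the mktime version, it treats all values as points within a single day, so it does not handle times that wrap around midnight.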

Upvotes: 2

bchurchill

Reputation: 1420

I think the best thing to do is to convert all those values to a number of seconds and average the whole list. I'll assume that these times are strings in mylist.

time_list = [int(s[6:8]) + 60*(int(s[3:5]) + 60*int(s[0:2])) for s in mylist]
average = sum(time_list) // len(time_list)
bigmins, secs = divmod(average, 60)
hours, mins = divmod(bigmins, 60)
print("%02d:%02d:%02d" % (hours, mins, secs))

This is essentially what eumiro recommended. The first line computes the number of seconds for each string. The second line averages them. The next two lines split the average into hours, minutes, and seconds, and the last line formats the output nicely.

Upvotes: 1

eumiro

Reputation: 212835

You don't want to "average" times on hours, minutes and seconds this way:

00:59:00
01:01:00

average clearly to 01:00:00, but not with the logic you presented.

Instead convert all your time intervals into seconds, calculate the average and convert back to HH:MM:SS.

00:59:00 -> 3540 seconds
01:01:00 -> 3660 seconds
            ============
average:    3600 seconds converted to HH:MM:SS -> 01:00:00
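Here is a rough sketch comparing the two approaches on the example above. The function names are made up for illustration, and the per-field version assumes each averaged field is truncated to a whole number before the fields are joined back together.

```python
def to_seconds(t):
    """Convert 'HH:MM:SS' to a total number of seconds."""
    h, m, s = (int(part) for part in t.split(":"))
    return h * 3600 + m * 60 + s

def to_hms(total):
    """Convert a total number of seconds back to 'HH:MM:SS'."""
    return "%02d:%02d:%02d" % (total // 3600, total % 3600 // 60, total % 60)

def mean_by_field(times):
    # average hours, minutes and seconds independently, truncating each field
    fields = [[int(p) for p in t.split(":")] for t in times]
    n = len(fields)
    return "%02d:%02d:%02d" % tuple(sum(col) // n for col in zip(*fields))

def mean_by_seconds(times):
    # convert everything to seconds, average once, then convert back
    return to_hms(sum(to_seconds(t) for t in times) // len(times))

times = ["00:59:00", "01:01:00"]
print(mean_by_field(times))    # 00:30:00
print(mean_by_seconds(times))  # 01:00:00
```

The per-field version loses the half hour when the hours field (0.5) is truncated, which is why averaging the total seconds is the safer route for durations.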

Upvotes: 9
