Reputation: 1764
I have a huge list of times (HH:MM:SS). I know that to compute an average I could separate the hours, minutes, and seconds, average each component, and concatenate them back together. However, I feel there must be a better way. Does anyone know of one?
Thanks!
Upvotes: 12
Views: 30999
Reputation: 301
There's a problem with converting to seconds since midnight and averaging. If you do that with 23:50 and 00:10 you get 12:00 when what you want is 00:00.
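To make the pitfall concrete, here is a minimal sketch of the naive approach. The helper name `naive_mean_time` is made up for this illustration; it simply averages seconds since midnight.

```python
import datetime

def naive_mean_time(times):
    """Average datetime.time values as plain seconds since midnight (hypothetical helper)."""
    seconds = [3600 * t.hour + 60 * t.minute + t.second for t in times]
    mean_seconds = sum(seconds) // len(seconds)
    hours, remainder = divmod(mean_seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return datetime.time(hours, minutes, secs)

# Ten minutes before and ten minutes after midnight average to noon
print(naive_mean_time([datetime.time(23, 50), datetime.time(0, 10)]))
# 12:00:00
```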
A better approach is to average the angles.
import datetime
import math
import numpy

SECONDS_PER_DAY = 24 * 60 * 60

def datetime_to_radians(x):
    # radians are calculated using a 24-hour circle, not 12-hour, starting at north and moving clockwise
    time_of_day = x.time()
    seconds_from_midnight = 3600 * time_of_day.hour + 60 * time_of_day.minute + time_of_day.second
    radians = float(seconds_from_midnight) / float(SECONDS_PER_DAY) * 2.0 * math.pi
    return radians

def average_angle(angles):
    # angles measured in radians
    x_sum = numpy.sum([math.sin(x) for x in angles])
    y_sum = numpy.sum([math.cos(x) for x in angles])
    x_mean = x_sum / float(len(angles))
    y_mean = y_sum / float(len(angles))
    return numpy.arctan2(x_mean, y_mean)

def radians_to_time_of_day(x):
    # radians are measured clockwise from north and represent time in a 24-hour circle
    # arctan2 returns a value in [-pi, pi], so wrap it into [0, 2*pi) before converting
    fraction_of_day = (float(x) % (2.0 * math.pi)) / (2.0 * math.pi)
    # round to the nearest second; the final modulo maps a rounded-up 24:00:00 back to 00:00:00
    seconds_from_midnight = int(round(fraction_of_day * SECONDS_PER_DAY)) % SECONDS_PER_DAY
    hour = seconds_from_midnight // 3600
    minute = (seconds_from_midnight % 3600) // 60
    second = seconds_from_midnight % 60
    return datetime.time(hour, minute, second)

def average_times_of_day(x):
    # input datetime.datetime array and output datetime.time value
    angles = [datetime_to_radians(y) for y in x]
    avg_angle = average_angle(angles)
    return radians_to_time_of_day(avg_angle)

average_times_of_day([datetime.datetime(2017, 6, 9, 0, 10), datetime.datetime(2017, 6, 9, 0, 20)])
# datetime.time(0, 15)
average_times_of_day([datetime.datetime(2017, 6, 9, 23, 50), datetime.datetime(2017, 6, 9, 0, 10)])
# datetime.time(0, 0)
Upvotes: 18
Reputation: 1051
There is an alternative to the great answers already contributed, but it is case-specific. Suppose you want to average the time of day people go to bed, which normally falls somewhere between 6 pm and 6 am. First convert hours and minutes to a decimal, so that 12:30 becomes 12.5. Then add 24 to the times that would otherwise throw off the average: for the sleep case, the times between 0:00 and 6:00 am become 24.0 to 30.0. Now you can compute the average as usual. Finally, subtract 24 if the average is 24 or higher, and you are done:
import numpy as np
import pandas as pd

def hourtoDec(data):
    '''
    Transforms the hour string values in the list data
    to decimal. The format assumed is HH:mm.
    Values are transformed to float
    For example for 5:30pm the equivalent is 17.5
    This function preserves NaN values
    '''
    dataOutput=[]
    for i in data:
        if not pd.isnull(i):
            if isinstance(i, str):
                h,m=i.split(':')
                h=int(h)
                m=int(m)
                dataOutput.append(h+m/60.0)
            # np.float was removed in NumPy 1.24; np.floating covers NumPy float scalars
            elif isinstance(i, (np.floating, float)):
                dataOutput.append(i)
        else:
            dataOutput.append(i)
    return dataOutput

timestr=pd.DataFrame(["2020-04-26T23:00:30.000",
                      "2020-04-25T22:00:30.000",
                      "2020-04-24T01:00:30.000",
                      "2020-04-23T02:00:30.000"],columns=["timestamp"])
# keep only the HH:mm part of each timestamp
hours=timestr['timestamp'].apply(lambda x: ":".join(x.split("T")[1].split(":")[0:2]))
hoursDec=hourtoDec(hours)
times2=[]
for i in hoursDec:
    # shift the early-morning hours past midnight so they sort after the evening hours
    if i>=0 and i<6:
        times2.append(i+24)
    else:
        times2.append(i)
average=np.mean(times2)
if average>=24:
    average=average-24
print(average)
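The same shift can be written more compactly. This is only a sketch: the function name `mean_hour_with_cutoff` and its `cutoff` parameter are my own invention, and it assumes the input is already in decimal hours and that every time before `cutoff` belongs to the following day.

```python
def mean_hour_with_cutoff(hours, cutoff=6.0):
    """Average decimal hours, treating values before `cutoff` as next-day times.

    Hypothetical helper: `cutoff` marks where the day is "cut", so hours in
    [0, cutoff) are shifted by 24 before averaging.
    """
    shifted = [h + 24 if 0 <= h < cutoff else h for h in hours]
    mean = sum(shifted) / len(shifted)
    return mean - 24 if mean >= 24 else mean

# Bedtimes of 23:00, 22:00, 01:00 and 02:00 as decimal hours
print(mean_hour_with_cutoff([23.0, 22.0, 1.0, 2.0]))
# 0.0
```

The cutoff has to be chosen so that no real data straddles it; for bedtimes, 6 am is a reasonable guess, but for other data a different value may be needed.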
Upvotes: 2
Reputation: 2411
Convert each time to an angle on a 24-hour circle, represent each angle as a complex number of unit length, average those numbers and take the argument of the result.
Finally, convert the mean angle back to seconds and format it as HH:MM:SS.
from cmath import rect, phase
from math import radians, degrees

def meanAngle(deg):
    # average the unit vectors and return the direction of the result in degrees
    complexDegree = sum(rect(1, radians(d)) for d in deg) / len(deg)
    argument = phase(complexDegree)
    meanAngle = degrees(argument)
    return meanAngle

def meanTime(times):
    t = (time.split(':') for time in times)
    seconds = ((float(s) + int(m) * 60 + int(h) * 3600)
               for h, m, s in t)
    day = 24 * 60 * 60
    toAngles = [s * 360. / day for s in seconds]
    meanAsAngle = meanAngle(toAngles)
    # round to whole seconds and wrap negative angles into [0, day)
    meanSeconds = round(meanAsAngle * day / 360.) % day
    h, m = divmod(meanSeconds, 3600)
    m, s = divmod(m, 60)
    return('%02i:%02i:%02i' % (h, m, s))

print(meanTime(["15:00:00", "21:00:00"]))
# 18:00:00
print(meanTime(["23:00:00", "01:00:00"]))
# 00:00:00
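Written as formulas, the functions above compute the following, where $s_k$ is the $k$-th time in seconds since midnight, $n$ is the number of times and $D = 86400$ is the length of a day in seconds (the symbols are my notation, not part of the code; the code works in degrees, which is equivalent):

```latex
\theta_k = \frac{2\pi s_k}{D}, \qquad
\bar{z} = \frac{1}{n} \sum_{k=1}^{n} e^{i\theta_k}, \qquad
\bar{s} = \left( \frac{D}{2\pi} \arg \bar{z} \right) \bmod D
```

Note that the mean is undefined when $\bar{z} = 0$, for example for 00:00:00 and 12:00:00, which sit exactly opposite each other on the circle.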
Upvotes: 2
Reputation: 1350
Here is one possible implementation of the answer by @eumiro, but this logic only works if these are durations, not times, as pointed out by @lazyr:
from datetime import timedelta
times = ['00:58:00', '00:59:00', '01:00:00', '01:01:00', '01:02:00']
# convert each HH:MM:SS string to seconds, then average and format the result
seconds = [int(h) * 3600 + int(m) * 60 + int(s) for h, m, s in (t.split(':') for t in times)]
print(str(timedelta(seconds=sum(seconds) / len(times))))
Also thanks to a post by @SilentGhost, and a post by @Herms
Upvotes: 8
Reputation: 20126
First parse each time from its string format into a time struct using strptime, then convert it to seconds since the epoch using mktime. Add up all the seconds, divide by the number of times, and convert the result back to a time struct using localtime.
Here is an example:
import time

a = time.strptime("2000:11:12:13", "%Y:%H:%M:%S")
b = time.strptime("2000:11:14:13", "%Y:%H:%M:%S")
avg_time = time.localtime((time.mktime(a) + time.mktime(b)) / 2)
print(avg_time)
# time.struct_time(tm_year=2000, tm_mon=1, tm_mday=1, tm_hour=11, tm_min=13, tm_sec=13, tm_wday=5, tm_yday=1, tm_isdst=0)
Note that I added the year 2000 because mktime raises an OverflowError for the default year 1900.
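As an aside, the same idea can be sketched with the `datetime` module, which parses `HH:MM:SS` directly and so does not need the year workaround. The function name `mean_clock_time` is hypothetical, and like the code above it ignores wraparound at midnight.

```python
from datetime import datetime, timedelta

def mean_clock_time(times):
    """Average HH:MM:SS strings via datetime arithmetic (hypothetical helper)."""
    parsed = [datetime.strptime(t, "%H:%M:%S") for t in times]
    base = parsed[0]
    # Average the offsets from the first value to stay within timedelta arithmetic
    total_offset = sum((p - base for p in parsed), timedelta())
    return (base + total_offset / len(parsed)).time()

print(mean_clock_time(["11:12:13", "11:14:13"]))
# 11:13:13
```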
Upvotes: 2
Reputation: 1420
I think the best thing to do is to convert all those values to a number of seconds and average the whole list. I'll assume that these times are strings in mylist.
time_list = [int(s[6:8]) + 60 * (int(s[3:5]) + 60 * int(s[0:2])) for s in mylist]
average = sum(time_list) // len(time_list)
bigmins, secs = divmod(average, 60)
hours, mins = divmod(bigmins, 60)
print("%02d:%02d:%02d" % (hours, mins, secs))
This is essentially what eumiro recommended. The first line computes the number of seconds for each string. The second line averages them. The next two lines work out the hours, minutes, and seconds, and the last line formats the output nicely.
Upvotes: 1
Reputation: 212835
You don't want to "average" times on hours, minutes and seconds this way:
00:59:00
01:01:00
average clearly to 01:00:00, but not with the logic you presented.
Instead convert all your time intervals into seconds, calculate the average and convert back to HH:MM:SS.
00:59:00 -> 3540 seconds
01:01:00 -> 3660 seconds
============
average: 3600 seconds converted to HH:MM:SS -> 01:00:00
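A small sketch of the comparison above. The function names are made up for illustration, and integer division is assumed for both averages:

```python
def componentwise_average(times):
    """Average hours, minutes and seconds separately (the approach to avoid)."""
    parts = [[int(p) for p in t.split(":")] for t in times]
    h, m, s = (sum(col) // len(times) for col in zip(*parts))
    return "%02d:%02d:%02d" % (h, m, s)

def seconds_average(times):
    """Convert to seconds, average, and convert back to HH:MM:SS."""
    totals = [int(h) * 3600 + int(m) * 60 + int(s)
              for h, m, s in (t.split(":") for t in times)]
    minutes, secs = divmod(sum(totals) // len(totals), 60)
    hours, minutes = divmod(minutes, 60)
    return "%02d:%02d:%02d" % (hours, minutes, secs)

times = ["00:59:00", "01:01:00"]
print(componentwise_average(times))  # 00:30:00
print(seconds_average(times))        # 01:00:00
```

The component-wise version loses the half hour that the averaged hours should have carried into the minutes, which is why it lands on 00:30:00.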
Upvotes: 9