Philipp Braun
Philipp Braun

Reputation: 1573

Average Date Array Calculation

I would like to get the mean of the following dates. I thought about converting all the data to seconds and then averaging them. But there is probably a better way to do it.

date = ['2016-02-23 09:36:26', '2016-02-24 10:00:32', '2016-02-24 11:28:22', '2016-02-24 11:27:20', '2016-02-24 11:24:15', '2016-02-24 11:20:25', '2016-02-24 11:17:43', '2016-02-24 11:12:03', '2016-02-24 11:09:11', '2016-02-24 11:08:44', '2016-02-24 11:05:28', '2016-02-24 11:03:23', '2016-02-24 10:58:08', '2016-02-24 10:53:59', '2016-02-24 10:49:34', '2016-02-24 10:43:33', '2016-02-24 10:35:27', '2016-02-24 10:31:50', '2016-02-24 10:31:17', '2016-02-24 10:30:05', '2016-02-24 10:29:21']

Nasty solution:

import datetime
import time
import numpy as np

date = ['2016-02-23 09:36:26', '2016-02-24 10:00:32', '2016-02-24 11:28:22', '2016-02-24 11:27:20', '2016-02-24 11:24:15', '2016-02-24 11:20:25', '2016-02-24 11:17:43', '2016-02-24 11:12:03', '2016-02-24 11:09:11', '2016-02-24 11:08:44', '2016-02-24 11:05:28', '2016-02-24 11:03:23', '2016-02-24 10:58:08', '2016-02-24 10:53:59', '2016-02-24 10:49:34', '2016-02-24 10:43:33', '2016-02-24 10:35:27', '2016-02-24 10:31:50', '2016-02-24 10:31:17', '2016-02-24 10:30:05', '2016-02-24 10:29:21']
sec = [time.mktime(datetime.datetime.strptime(d, "%Y-%m-%d %H:%M:%S").timetuple()) for d in date]
mean = datetime.datetime.fromtimestamp(np.mean(sec))
print(mean)

Upvotes: 7

Views: 8052

Answers (1)

unutbu
unutbu

Reputation: 880757

In NumPy all datetime64[s]s are internally represented by 8-byte integers. The ints represent the number of seconds since the Epoch.

So you could convert the date list to a NumPy array of datetime64[s] dtype, view it as dtype i8 (8-byte ints), take the mean, and then convert the int back into a datetime64[s].


import numpy as np

date = ['2016-02-23 09:36:26', '2016-02-24 10:00:32', '2016-02-24 11:28:22', '2016-02-24 11:27:20', '2016-02-24 11:24:15', '2016-02-24 11:20:25', '2016-02-24 11:17:43', '2016-02-24 11:12:03', '2016-02-24 11:09:11', '2016-02-24 11:08:44', '2016-02-24 11:05:28', '2016-02-24 11:03:23', '2016-02-24 10:58:08', '2016-02-24 10:53:59', '2016-02-24 10:49:34', '2016-02-24 10:43:33', '2016-02-24 10:35:27', '2016-02-24 10:31:50', '2016-02-24 10:31:17', '2016-02-24 10:30:05', '2016-02-24 10:29:21']

mean = (np.array(date, dtype='datetime64[s]')
        .view('i8')
        .mean()
        .astype('datetime64[s]'))

print(mean)

prints

2016-02-24T09:43:40-0500

Upvotes: 14

Related Questions