Reputation: 893
I have a large dataset with more than 500 000 date & time stamps that look like this:
date time
2017-06-25 00:31:53.993
2017-06-25 00:32:31.224
2017-06-25 00:33:11.223
2017-06-25 00:33:53.876
2017-06-25 00:34:31.219
2017-06-25 00:35:12.634
How do I round these timestamps off to the nearest second?
My code looks like this:
import datetime
import pandas as pd

readcsv = pd.read_csv(filename)
readcsv['date'] = pd.to_datetime(readcsv['date']).dt.date
readcsv['time'] = pd.to_datetime(readcsv['time']).dt.time
# Read the columns only after conversion, so they hold date/time objects rather than strings.
log_date = readcsv.date
log_time = readcsv.time
timestamp = [datetime.datetime.combine(log_date[i], log_time[i]) for i in range(len(log_date))]
So now I have combined the dates and times into a list of datetime.datetime objects that looks like this:
datetime.datetime(2017,6,25,00,31,53,993000)
datetime.datetime(2017,6,25,00,32,31,224000)
datetime.datetime(2017,6,25,00,33,11,223000)
datetime.datetime(2017,6,25,00,33,53,876000)
datetime.datetime(2017,6,25,00,34,31,219000)
datetime.datetime(2017,6,25,00,35,12,634000)
Where do I go from here?
The df.timestamp.dt.round('1s') function doesn't seem to be working.
Also, when using .split() I was having issues when the seconds and minutes exceeded 59.
Many thanks
Upvotes: 28
Views: 54980
Reputation: 15225
Here's a simple solution that properly rounds up and down and doesn't use any string hacks:
from datetime import datetime, timedelta
def round_to_secs(dt: datetime) -> datetime:
    extra_sec = round(dt.microsecond / 10 ** 6)
    return dt.replace(microsecond=0) + timedelta(seconds=extra_sec)
Some examples:
now = datetime.now()
print(now) # 2021-07-26 10:43:54.397538
print(round_to_secs(now)) # 2021-07-26 10:43:54 -- rounded down
now = datetime.now()
print(now) # 2021-07-26 10:44:59.787438
print(round_to_secs(now)) # 2021-07-26 10:45:00 -- rounded up taking into account secs and minutes
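One detail worth knowing about this approach: Python 3's built-in round sends exact halves to the nearest even integer, so a timestamp with exactly 500000 microseconds is rounded down. A small sketch of that edge case (the sample values are made up for illustration, and the function is repeated only so the snippet runs on its own):

```python
from datetime import datetime, timedelta

def round_to_secs(dt: datetime) -> datetime:
    # Same logic as in the answer, repeated to keep this snippet self-contained.
    extra_sec = round(dt.microsecond / 10 ** 6)
    return dt.replace(microsecond=0) + timedelta(seconds=extra_sec)

half = datetime(2017, 6, 25, 0, 31, 53, 500000)       # exactly half a second
just_over = datetime(2017, 6, 25, 0, 31, 53, 500001)  # one microsecond more

print(round_to_secs(half))       # 2017-06-25 00:31:53, because round(0.5) == 0
print(round_to_secs(just_over))  # 2017-06-25 00:31:54
```

If halves should always round up, comparing dt.microsecond against 500000 directly avoids this behaviour.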
Upvotes: 1
Reputation: 2721
Another way of doing this, using the built-in round on a timedelta:
import datetime
original = datetime.timedelta(seconds=50, milliseconds=20)
rounded = datetime.timedelta(seconds=round(original.total_seconds()))
Upvotes: 1
Reputation: 3117
Without any extra packages, a datetime object can be rounded to the nearest second with the following simple function:
import datetime as dt
def round_seconds(obj: dt.datetime) -> dt.datetime:
    if obj.microsecond >= 500_000:
        obj += dt.timedelta(seconds=1)
    return obj.replace(microsecond=0)
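As a usage sketch, here is the function applied to a list like the one in the question. The third value is made up to show that the carry into minutes, hours and the date is handled by timedelta arithmetic rather than by hand:

```python
import datetime as dt

def round_seconds(obj: dt.datetime) -> dt.datetime:
    # Same function as above, repeated so this snippet runs on its own.
    if obj.microsecond >= 500_000:
        obj += dt.timedelta(seconds=1)
    return obj.replace(microsecond=0)

stamps = [
    dt.datetime(2017, 6, 25, 0, 31, 53, 993000),
    dt.datetime(2017, 6, 25, 0, 32, 31, 224000),
    dt.datetime(2017, 6, 25, 23, 59, 59, 634000),  # hypothetical value near midnight
]

rounded = [round_seconds(s) for s in stamps]
for r in rounded:
    print(r)
# 2017-06-25 00:31:54
# 2017-06-25 00:32:31
# 2017-06-26 00:00:00
```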
Upvotes: 32
Reputation: 58
I needed this, so I adjusted @srisaila's answer to handle seconds and minutes that round up to 60. The style is horribly complicated, but it uses only basic functions.
def round_seconds(dts):
    result = []
    for item in dts:
        date = item.split()[0]
        h, m, s = [item.split()[1].split(':')[0],
                   item.split()[1].split(':')[1],
                   str(round(float(item.split()[1].split(':')[-1])))]
        if len(s) == 1:
            s = '0' + s
        if int(s) == 60:
            # Carry the overflowing seconds into the minutes.
            m_tmp = int(m)
            m_tmp += 1
            m = str(m_tmp)
            if len(m) == 1:
                m = '0' + m
            s = '00'
        if int(m) == 60:  # m is a string, so compare it as an int
            # Carry the overflowing minutes into the hours.
            h_tmp = int(h)
            h_tmp += 1
            h = str(h_tmp)
            if len(h) == 1:
                h = '0' + h
            m = '00'
        # Note: an hour that reaches 24 is not carried into the date.
        result.append(date + ' ' + h + ':' + m + ':' + s)
    return result
Upvotes: 0
Reputation: 382
An elegant solution that only requires the standard datetime module. Note that it drops the microseconds, so it always rounds down.
import datetime
currentimemili = datetime.datetime.now()
currenttimesecs = currentimemili - \
    datetime.timedelta(microseconds=currentimemili.microsecond)
print(currenttimesecs)
Upvotes: 0
Reputation: 189
If anyone wants to round a single datetime item off to the nearest second, this one works just fine:
pandas.to_datetime(your_datetime_item).round('1s')
Upvotes: 17
Reputation: 6329
Alternate version of @electrovir 's solution:
import datetime
def roundSeconds(dateTimeObject):
    newDateTime = dateTimeObject + datetime.timedelta(seconds=.5)
    return newDateTime.replace(microsecond=0)
Upvotes: 5
Reputation: 15718
The question doesn't say how you want to round. Rounding down would often be appropriate for a time function. This is not statistics.
rounded_down_datetime = raw_datetime.replace(microsecond=0)
Upvotes: 19
Reputation: 403278
If you're using pandas, you can just round the data to the nearest second using dt.round:
df
timestamp
0 2017-06-25 00:31:53.993
1 2017-06-25 00:32:31.224
2 2017-06-25 00:33:11.223
3 2017-06-25 00:33:53.876
4 2017-06-25 00:34:31.219
5 2017-06-25 00:35:12.634
df.timestamp.dt.round('1s')
0 2017-06-25 00:31:54
1 2017-06-25 00:32:31
2 2017-06-25 00:33:11
3 2017-06-25 00:33:54
4 2017-06-25 00:34:31
5 2017-06-25 00:35:13
Name: timestamp, dtype: datetime64[ns]
If timestamp isn't a datetime column, convert it first using pd.to_datetime:
df.timestamp = pd.to_datetime(df.timestamp)
Then, dt.round should work.
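For the two-column layout in the question, a minimal sketch, assuming date and time are read from the CSV as strings. The frame below is made up for illustration (the last row is a hypothetical value near midnight), and the column names timestamp and rounded are only placeholders:

```python
import pandas as pd

# Made-up frame mirroring the question's separate date and time columns.
df = pd.DataFrame({
    "date": ["2017-06-25", "2017-06-25", "2017-06-25"],
    "time": ["00:31:53.993", "00:32:31.224", "23:59:59.634"],
})

# Join the strings and parse them once into a single datetime column.
df["timestamp"] = pd.to_datetime(df["date"] + " " + df["time"])

# Vectorised rounding to the nearest second; pandas handles the carry
# into minutes, hours and the date.
df["rounded"] = df["timestamp"].dt.round("1s")

print(df["rounded"].tolist())
```

This avoids building a Python list of datetime.datetime objects, which should matter with 500 000 rows. If rounding down is preferred, dt.floor("1s") can be used in place of dt.round("1s").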
Upvotes: 9
Reputation: 19
If the dataset is stored in a file, you can do it like this:
with open('../dataset.txt') as fp:
    # Iterate over the file directly so that no line is skipped.
    for line in fp:
        line = line.strip()
        if not line:
            continue
        print("\n" + line)
        sec = line[line.rfind(':') + 1:]
        rounded_num = int(round(float(sec)))
        print(line[:line.rfind(':') + 1] + str(rounded_num))
        print(abs(float(sec) - rounded_num))
If the dataset is stored in a list:
dts = ['2017-06-25 00:31:53.993',
'2017-06-25 00:32:31.224',
'2017-06-25 00:33:11.223',
'2017-06-25 00:33:53.876',
'2017-06-25 00:34:31.219',
'2017-06-25 00:35:12.634']
for i in dts:
    line = i.strip()
    print("\n" + line)
    sec = line[line.rfind(':') + 1:]
    rounded_num = int(round(float(sec)))
    # Note: seconds that round up to 60 are not carried into the minutes here.
    print(line[:line.rfind(':') + 1] + str(rounded_num))
    print(abs(float(sec) - rounded_num))
Upvotes: 1
Reputation: 2622
Using a for loop and str.split():
dts = ['2017-06-25 00:31:53.993',
'2017-06-25 00:32:31.224',
'2017-06-25 00:33:11.223',
'2017-06-25 00:33:53.876',
'2017-06-25 00:34:31.219',
'2017-06-25 00:35:12.634']
for item in dts:
    date = item.split()[0]
    h, m, s = [item.split()[1].split(':')[0],
               item.split()[1].split(':')[1],
               str(round(float(item.split()[1].split(':')[-1])))]
    print(date + ' ' + h + ':' + m + ':' + s)
2017-06-25 00:31:54
2017-06-25 00:32:31
2017-06-25 00:33:11
2017-06-25 00:33:54
2017-06-25 00:34:31
2017-06-25 00:35:13
>>>
You could turn that into a function:
def round_seconds(dts):
    result = []
    for item in dts:
        date = item.split()[0]
        h, m, s = [item.split()[1].split(':')[0],
                   item.split()[1].split(':')[1],
                   str(round(float(item.split()[1].split(':')[-1])))]
        result.append(date + ' ' + h + ':' + m + ':' + s)
    return result
Testing the function:
dts = ['2017-06-25 00:31:53.993',
'2017-06-25 00:32:31.224',
'2017-06-25 00:33:11.223',
'2017-06-25 00:33:53.876',
'2017-06-25 00:34:31.219',
'2017-06-25 00:35:12.634']
from pprint import pprint
pprint(round_seconds(dts))
['2017-06-25 00:31:54',
'2017-06-25 00:32:31',
'2017-06-25 00:33:11',
'2017-06-25 00:33:54',
'2017-06-25 00:34:31',
'2017-06-25 00:35:13']
>>>
Since you seem to be using Python 2.7, to drop any trailing zeros, you may need to change:
str(round(float(item.split()[1].split(':')[-1])))
to
str(round(float(item.split()[1].split(':')[-1]))).rstrip('0').rstrip('.')
I've just tried the function with Python 2.7 at repl.it and it ran as expected.
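The string-splitting approach does not carry seconds that round up to 60 into the minutes, which is the issue the question mentions. One way around it, sketched here under the assumption that every string follows the layout of the question's sample, is to parse each string into a datetime, round it, and format it back. The names FMT and round_seconds_str are hypothetical, and this sketch targets Python 3:

```python
from datetime import datetime, timedelta

# Assumed input layout, matching the question's sample data.
FMT = "%Y-%m-%d %H:%M:%S.%f"

def round_seconds_str(dts):
    result = []
    for item in dts:
        parsed = datetime.strptime(item, FMT)
        # Add half a second, then drop the fraction: rounds to the nearest second.
        rounded = (parsed + timedelta(milliseconds=500)).replace(microsecond=0)
        result.append(rounded.strftime("%Y-%m-%d %H:%M:%S"))
    return result

print(round_seconds_str(["2017-06-25 00:31:53.993",
                         "2017-06-25 23:59:59.634"]))
# ['2017-06-25 00:31:54', '2017-06-26 00:00:00']
```

Because the carry is done by datetime arithmetic, the output stays zero-padded and never contains a 60 in the seconds or minutes field.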
Upvotes: 2