Python: datetime64 issues with range

Question

I am trying to build a vector of one-second timestamps between two times:

import numpy as np
import pandas as pd    
date="2011-01-10"
start=np.datetime64(date+'T09:30:00')
end=np.datetime64(date+'T16:00:00')
range = pd.date_range(start, end, freq='S')

For some reason when I print range I get:

[2011-01-10 17:30:00, ..., 2011-01-11 00:00:00]

So the length is 23401, which is what I want, but the time interval is definitely not correct. Why is that?

Also, if I have a DataFrame df with a column of datetime64 format that looks like:

Time
15:59:57.887529007
15:59:57.805383290

Once I have solved the problem above, will I be able to do the following?

data = df.reindex(df.Time + range)
data = data.ffill()

I need to do the exact steps proposed here except with datetime64 format. Is it possible?

Andy Hayden · Accepted Answer

It seems that pandas date_range is dropping the timezone (it looks like a bug, which I think has already been filed). You can use Timestamp rather than datetime64 to work around this:

In [11]: start = pd.Timestamp(date+'T09:30:00')

In [12]: end = pd.Timestamp(date+'T16:00:00')

In [13]: pd.date_range(start, end, freq='S')
Out[13]: 

[2011-01-10 09:30:00, ..., 2011-01-10 16:00:00]
Length: 23401, Freq: S, Timezone: None
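
For the follow-up about reindexing and forward-filling, here is a minimal sketch rather than a definitive answer. It assumes the DataFrame is indexed by full timestamps (not time-of-day values as in the question's sample), and the `ticks` frame and its `price` values are made up for illustration. The frequency is passed as a Timedelta instead of the 'S' string alias, which is only a choice made for this sketch.

```python
import pandas as pd

date = "2011-01-10"
start = pd.Timestamp(date + "T09:30:00")
end = pd.Timestamp(date + "T16:00:00")

# One-second grid built from Timestamps, as in the workaround above.
seconds = pd.date_range(start, end, freq=pd.Timedelta(seconds=1))

# Hypothetical tick data with irregular timestamps (illustrative values only).
ticks = pd.DataFrame(
    {"price": [10.0, 10.5]},
    index=pd.to_datetime(["2011-01-10 09:30:00.25", "2011-01-10 09:30:01.75"]),
)

# Reindex onto the union of both indexes, then carry the last value forward.
filled = ticks.reindex(ticks.index.union(seconds)).ffill()

# Keep only the rows that fall on the one-second grid.
on_grid = filled.loc[seconds]
```

The first grid point stays NaN because no tick precedes 09:30:00; every later second carries the most recent tick forward.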

Note: To see that it's a timezone issue: you're in UTC-8, and 16:00 + 8:00 == 00:00 (the next day).
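
The offset can be checked with plain datetime arithmetic. This is only a sketch, and the fixed eight-hour offset is an assumption about the asker's machine: shifting the requested 09:30 and 16:00 by eight hours gives exactly the endpoints printed in the question.

```python
from datetime import datetime, timedelta

# Assumed fixed offset of the asker's machine (UTC-8).
offset = timedelta(hours=8)

requested_start = datetime(2011, 1, 10, 9, 30)
requested_end = datetime(2011, 1, 10, 16, 0)

# Local wall-clock times reinterpreted as UTC move forward by the offset.
shifted_start = requested_start + offset
shifted_end = requested_end + offset

print(shifted_start)  # 2011-01-10 17:30:00
print(shifted_end)    # 2011-01-11 00:00:00
```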
