Reputation: 33
This code has been working for me for months and this morning it is throwing the error: TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'
import datetime
import numpy as n
import pandas as pd
dt=datetime.datetime.strptime
# Date is the array of timestamp strings; data downloaded with obtimezone=local
date_array=[]
for i in range(len(Date)):
    date_array.append(dt(Date[i],'%Y-%m-%dT%H:%M:%S%z'))
date_array=n.array(date_array)
# Wire Mountain Dataframe
W_data=pd.DataFrame(data={'Solar':WIRC1},index=date_array)
W_mask=W_data.where(W_data > 0) # Using only daytime data, when solar does not equal 0
W_mean=W_mask.resample('D').mean() #Daily mean
The dataframe looks like this:
Solar
2020-10-25 00:50:00-07:00 0.0
2020-10-25 01:50:00-07:00 0.0
2020-10-25 02:50:00-07:00 0.0
2020-10-25 03:50:00-07:00 0.0
2020-10-25 04:50:00-07:00 0.0
2020-10-25 05:50:00-07:00 0.0
2020-10-25 06:50:00-07:00 0.0
2020-10-25 07:50:00-07:00 2.0
2020-10-25 08:50:00-07:00 49.0
2020-10-25 09:50:00-07:00 116.0
2020-10-25 10:50:00-07:00 155.0
2020-10-25 11:50:00-07:00 233.0
2020-10-25 12:50:00-07:00 363.0
The array I used as the index for the dataframe holds Python datetime objects:
type(date_array[0])
Out[24]: datetime.datetime
Why did this suddenly stop working? Did something change in the Pandas backend? I thought I might be able to convert the Python datetime index to a Pandas one using:
date_array=n.array(pd.to_datetime(date_array))
But got:
ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True
I also tried this, from another Stack Overflow question:
W_mean=W_mask.set_index(date_array).resample('D').mean()
But I got the same error. Thank you for any help you can provide!
Upvotes: 0
Views: 1208
Reputation: 33
The "something" that changed was the local time: it switched from daylight saving time to standard time. From this similar issue:
A pandas datetime column also requires the offset to be the same. A column with different offsets, will not be converted to a datetime dtype. I suggest, do not convert the data to a datetime until it's in pandas.
My data had two offsets, as shown below:
Date[0]
Out[34]: '2020-10-25T00:50:00-0700'
Date[-1]
Out[35]: '2020-11-07T22:50:00-0800'
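Because of the two different offsets, the dates were not being converted to a datetime dtype, so the dataframe ended up with a plain Index instead of a DatetimeIndex. That is exactly what resample was complaining about in the TypeError.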
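For anyone who wants to see this for themselves, here is a minimal sketch using only the two timestamps quoted above. The variable names (date_strings, mixed_index, local_index) are made up for the illustration, and passing utc=True is my assumption about the simplest way to normalise the offsets; it is not taken from the original script.

```python
import datetime

import pandas as pd

# The two timestamps quoted above: one before and one after the DST change.
date_strings = ["2020-10-25T00:50:00-0700", "2020-11-07T22:50:00-0800"]

# Parsing with the standard library gives tz-aware datetimes that carry two
# different fixed offsets, so pandas keeps them in a generic object Index.
parsed = [datetime.datetime.strptime(s, "%Y-%m-%dT%H:%M:%S%z") for s in date_strings]
mixed_index = pd.Index(parsed)
print(type(mixed_index).__name__, mixed_index.dtype)

# Normalising to UTC first yields a real DatetimeIndex; tz_convert then applies
# the US/Pacific rules, including the switch from -07:00 to -08:00.
local_index = pd.to_datetime(date_strings, utc=True).tz_convert("US/Pacific")
print(type(local_index).__name__)
print(local_index)
```

The second index is the kind resample accepts, because both timestamps now share one time zone even though their UTC offsets differ.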
I pulled the data in UTC instead of local time, then as suggested, I did not convert to datetime until the date column was in Pandas. After adding the conversion to US/Pacific time, Pandas handled the time change seamlessly.
import numpy as n
import pandas as pd
Date=n.genfromtxt('WIRC1.txt',delimiter=',',skip_header=8,usecols=1,dtype=str)
W_data=pd.DataFrame(data={'Solar':WIRC1},index=pd.to_datetime(Date).tz_convert('US/Pacific'))
W_mask=W_data.where(W_data > 0) # Using only daytime data, when solar does not equal 0
W_mean=W_mask.resample('D').mean() #Daily mean
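If you do not have the WIRC1.txt file, the same steps can be tried on made-up data. The sketch below is only an illustration: the hourly timestamps and the constant 100.0 "solar" values are synthetic, and the names utc_times, local_index and counts are hypothetical. It spans the 2020-11-01 time change to check that daily resampling follows local calendar days.

```python
import pandas as pd

# Synthetic hourly timestamps in UTC, written as strings the way a download
# in UTC might look (assumption: same layout as the Date strings above).
utc_times = pd.date_range(
    "2020-10-31 07:00", "2020-11-03 07:00", freq=pd.Timedelta(hours=1), tz="UTC"
)
Date = utc_times.strftime("%Y-%m-%dT%H:%M:%S%z")

# Convert only once the strings are in pandas, then move to local time.
local_index = pd.to_datetime(Date, utc=True).tz_convert("US/Pacific")

# Made-up solar values: 100.0 during local daytime hours, 0.0 at night.
solar = [100.0 if 8 <= hour <= 16 else 0.0 for hour in local_index.hour]
W_data = pd.DataFrame(data={"Solar": solar}, index=local_index)

W_mask = W_data.where(W_data > 0)       # keep daytime values only
W_mean = W_mask.resample("D").mean()    # daily mean of the daytime values
counts = W_data.resample("D").count()   # hourly rows that fall in each local day

print(W_mean)
print(counts)
```

The row counts per day are a quick sanity check: the day of the switch back to standard time should hold one more hourly row than its neighbours.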
Upvotes: 1