I'm very new to Python and am having trouble adjusting the results of an API request to handle the UK clock changes (Greenwich Mean Time / British Summer Time). The Dark Sky documentation states:
The timezone is only used for determining the time of the request; the response will always be relative to the local time zone
I have built the code below to return historic weather information: the min/max/avg of the hourly temperatures for each day, for specific weather stations. I've been able to turn the returned UNIX time into a timestamp, but I need a way to deal with the data when the clocks change.
This is my whole code. If anyone can offer any advice, I would be very grateful:
import datetime
import json

import pandas as pd
import pytemperature
import requests
from pandas.io.json import json_normalize

pd.options.mode.chained_assignment = None  # default='warn'

# Import CSV with station data
data_csv = pd.read_csv("Degree Days Stations.csv")

# Basic information for the API request
URL = "https://api.darksky.net/forecast/"
AUTH = "REDACTED/"
EXCLUDES = "?exclude=currently,daily,minutely,alerts,flags"

# Use zip() to loop through the columns of the data_csv dataframe
for STATION, NAME, CODE, LAT, LON in zip(data_csv['Station'], data_csv['Name'], data_csv['Code'], data_csv['Lat'], data_csv['Lon']):
    # Result is based upon the time zone when running the script
    date1 = '2019-10-26T00:00:00'
    date2 = '2019-10-29T00:00:00'
    start = datetime.datetime.strptime(date1, '%Y-%m-%dT%H:%M:%S')
    end = datetime.datetime.strptime(date2, '%Y-%m-%dT%H:%M:%S')
    step = datetime.timedelta(days=1)

    while start <= end:
        # Compile the daily values
        DATE = datetime.date.strftime(start.date(), "%Y-%m-%dT%H:%M:%S")

        # Build the API request URL
        # NOTE: "?units=si" after EXCLUDES adds a second "?", so the units
        # parameter is probably not parsed; "&units=si" looks like the intent.
        response = requests.get(URL + AUTH + str(LAT) + "," + str(LON) + "," + str(DATE) + EXCLUDES + "?units=si")
        json_data = response.json()

        # Flatten the hourly data
        json_df = json_normalize(json_data['hourly'], record_path=['data'], sep="_")

        # Extract to a new dataframe
        output_df = json_df[['time', 'temperature']]

        # Insert station name into the dataframe for my debugging
        output_df.insert(0, 'Name', NAME)

        # Convert UNIX time to datetime
        output_df['time'] = pd.to_datetime(output_df['time'], unit='s')

        ##################
        # Deal with timezone of output_df['time'] here
        ##################

        # Convert temperatures from °F to °C
        output_df['temperature'] = pytemperature.f2c(output_df['temperature'])

        # Find the MIN/MAX/AVG from the hourly data for this day
        MIN = output_df['temperature'].min()
        MAX = output_df['temperature'].max()
        AVG = output_df['temperature'].mean()

        # Build the POST query (values are sent as strings, as before)
        knackURL = "https://api.knack.com/v1/objects/object_34/records"
        payload = json.dumps({
            "field_647": STATION,
            "field_649": NAME,
            "field_650": str(CODE),
            "field_651": str(DATE),
            "field_652": str(MIN),
            "field_653": str(MAX),
            "field_655": str(AVG),
        })
        knackHEADERS = {
            'X-Knack-Application-Id': "REDACTED",
            'X-Knack-REST-API-Key': "REDACTED",
            'Content-Type': "application/json"
        }
        #response = requests.request("POST", knackURL, data=payload, headers=knackHEADERS)

        start += step
My results for October 27th (BST) and 28th (GMT) are shown below and are relative to the current time zone (GMT). How can I ensure that I get the same size of dataset for each day, starting from 00:00:00?
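One likely reason the two days differ in size is that a local calendar day in the UK is not always 24 hours long. The sketch below is an illustration only, using the standard-library `zoneinfo` module (Python 3.9+). The helper names `local_day_bounds_utc` and `hours_in_local_day` are hypothetical and not part of the script above.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

LONDON = ZoneInfo("Europe/London")


def local_day_bounds_utc(year, month, day, tz=LONDON):
    """Return (start, end) of a local calendar day, expressed in UTC."""
    start_local = datetime(year, month, day, tzinfo=tz)
    # Adding a timedelta to an aware datetime moves the wall clock,
    # so this is local midnight of the following day.
    end_local = start_local + timedelta(days=1)
    return start_local.astimezone(timezone.utc), end_local.astimezone(timezone.utc)


def hours_in_local_day(year, month, day, tz=LONDON):
    """Count the elapsed hours between two consecutive local midnights."""
    start_utc, end_utc = local_day_bounds_utc(year, month, day, tz)
    # Subtract in UTC; subtracting two datetimes that share a tzinfo
    # would ignore the change in UTC offset.
    return int((end_utc - start_utc) / timedelta(hours=1))


for d in (26, 27, 28):
    print(f"2019-10-{d}: {hours_in_local_day(2019, 10, d)} hours")
```

Under these assumptions, a day on which the clocks go back contains one extra hourly reading, so equal-sized daily datasets would only be expected if the day boundaries were defined in UTC rather than local time.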
I've looked at arrow and pytz but can't seem to get either to work in the context of the dataframe. I've been working on the assumption that I need to test and deal with the data when converting it from UNIX time, but I just can't get it right. Even trying to pass the data through arrow.get, e.g.
d1 = arrow.get(output_df['time'])
print(d1)
leaves me with the error:
"Can't parse single argument type of '{}'".format(type(arg))
TypeError: Can't parse single argument type of ''
So I'm guessing that it doesn't want to work on a whole dataframe column? Thank you in advance.
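For reference, pandas can make a whole column timezone-aware by itself, without passing the values through a single-value parser. This is a minimal sketch, assuming the timestamps are UNIX seconds and that `Europe/London` is the zone wanted. The `sample` dataframe and the `time_utc` / `time_local` column names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: three hourly UNIX timestamps around the 2019 clock change.
# 1572134400 is 2019-10-27 00:00:00 UTC.
sample = pd.DataFrame({"time": [1572134400, 1572138000, 1572141600]})

# Parse the integers as UTC-aware timestamps for the whole column at once...
sample["time_utc"] = pd.to_datetime(sample["time"], unit="s", utc=True)

# ...then convert to UK local time; pandas picks BST or GMT per row.
sample["time_local"] = sample["time_utc"].dt.tz_convert("Europe/London")

print(sample[["time_utc", "time_local"]])
```

The first two rows both read 01:00 local time but carry different UTC offsets (+01:00, then +00:00), which is the repeated hour when the clocks go back.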