CatParky

Reputation: 189

How to adjust Dark Sky API response data for daylight savings (GMT/BST)

I'm very new to Python and am having trouble adjusting the results of an API request to handle UK daylight saving time (Greenwich Mean Time / British Summer Time). The Dark Sky documentation states:

The timezone is only used for determining the time of the request; the response will always be relative to the local time zone

I have built the code below to return historic weather information, calculating the min/max/avg temperature for each day from the hourly data for specific weather stations. I've been able to turn the returned UNIX time into a timestamp, but I need a way to deal with the data when the clocks change.
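To illustrate why the clock change matters, here is a minimal sketch using only the standard library (`zoneinfo` needs Python 3.9+). It assumes every station is in the `Europe/London` zone, and the helper names are hypothetical. It shows that the local day of 27 October 2019 spans 25 hours, so hourly data for that day cannot have the same number of rows as a normal day.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # standard library from Python 3.9

LONDON = ZoneInfo("Europe/London")  # assumption: all stations use UK local time


def local_day_bounds(year, month, day, tz=LONDON):
    """Return (start, end) UNIX timestamps of one local calendar day."""
    start = datetime(year, month, day, tzinfo=tz)
    # Take the next calendar date rather than adding 24 elapsed hours,
    # so the end is the next local midnight even when the clocks change.
    next_date = start.date() + timedelta(days=1)
    end = datetime(next_date.year, next_date.month, next_date.day, tzinfo=tz)
    return int(start.timestamp()), int(end.timestamp())


def hours_in_local_day(year, month, day, tz=LONDON):
    """Number of elapsed hours between two consecutive local midnights."""
    start, end = local_day_bounds(year, month, day, tz)
    return (end - start) // 3600


print(hours_in_local_day(2019, 10, 26))  # 24 (BST all day)
print(hours_in_local_day(2019, 10, 27))  # 25 (clocks go back)
print(hours_in_local_day(2019, 10, 28))  # 24 (GMT all day)
```

The two timestamps from `local_day_bounds` could also be used to keep only the rows that fall inside one local day.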

This is my whole code. If anyone can offer any advice, I would be very grateful.

import requests
import json
import pytemperature
import pandas as pd
pd.options.mode.chained_assignment = None  # default='warn'

from pandas.io.json import json_normalize
import datetime

#Import CSV with station data
data_csv = pd.read_csv("Degree Days Stations.csv") 

#Basic Information for the API request
URL = "https://api.darksky.net/forecast/"
AUTH = "REDACTED/"
EXCLUDES = "?exclude=currently,daily,minutely,alerts,flags"

#Use zip to loop through the rows of the data_csv dataframe
for STATION, NAME, CODE, LAT, LON in zip(data_csv['Station'], data_csv['Name'], data_csv['Code'], data_csv['Lat'], data_csv['Lon']):

    #Result is based upon the time zone when running the script
    date1 = '2019-10-26T00:00:00'
    date2 = '2019-10-29T00:00:00'

    start = datetime.datetime.strptime(date1, '%Y-%m-%dT%H:%M:%S')
    end = datetime.datetime.strptime(date2, '%Y-%m-%dT%H:%M:%S')
    step = datetime.timedelta(days=1)
    while start <= end:
        #Compile the Daily Values
        DATE = start.strftime("%Y-%m-%dT%H:%M:%S")

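        #Build the API request URL. No units parameter is sent, so temperatures
        #come back in the API default (Fahrenheit) and are converted to Celsius below
        response = requests.get(URL+AUTH+str(LAT)+","+str(LON)+","+str(DATE)+EXCLUDES)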
        json_data = response.json()

        #Flatten the data 
        json_df = json_normalize(json_data['hourly'],record_path=['data'],sep="_")
        #Extract to new df
        output_df = json_df[['time','temperature']]

        #insert station name to dataframe for my debugging
        output_df.insert(0, 'Name', NAME)


        #Convert UNIX date to datetime
        output_df['time'] =  pd.to_datetime(output_df['time'], unit = 's')
        ##################
        # Deal with timezone of output_df['time'] here 
        ##################


        #Convert temperatures from °F to °C
        output_df['temperature'] = pytemperature.f2c(output_df['temperature'])
        #Find the MIN/MAX/AVG from the hourly data for this day
        MIN = output_df['temperature'].min()
        MAX = output_df['temperature'].max()
        AVG = output_df['temperature'].mean()

        #Build the POST query
        knackURL = "https://api.knack.com/v1/objects/object_34/records"
        payload = '{ \
                   "field_647": "' + STATION + '", \
                   "field_649": "' + NAME + '", \
                   "field_650": "' + str(CODE) + '", \
                   "field_651": "' + str(DATE) + '", \
                   "field_652": "' + str(MIN) + '", \
                   "field_653": "' + str(MAX) + '", \
                   "field_655": "' + str(AVG) + '" \
                  }'

        knackHEADERS = {
               'X-Knack-Application-Id': "REDACTED",
               'X-Knack-REST-API-Key': "REDACTED",
               'Content-Type': "application/json"
               }
        #response = requests.request("POST", knackURL, data=payload, headers=knackHEADERS)
        start += step

My results for October 27th (BST) and 28th (GMT) are shown below and are relative to the current timezone (GMT). How can I ensure that I get the same size of dataset, starting from 00:00:00?

[Image: results for October 27th and 28th]
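One possible way to fill in the "Deal with timezone" step, sketched with made-up timestamps rather than real API output: parse the UNIX times as UTC and convert them to `Europe/London` (an assumed zone for all stations), so each day's rows start at local 00:00:00. The helper name `to_london_time` and the sample data are hypothetical.

```python
import datetime

import pandas as pd


def to_london_time(epoch_seconds):
    """Convert a Series of UNIX timestamps (seconds) to tz-aware UK local time."""
    utc = pd.to_datetime(epoch_seconds, unit="s", utc=True)
    return utc.dt.tz_convert("Europe/London")


# Made-up hourly timestamps standing in for the API's 'time' column:
# 73 hours starting at 2019-10-26 00:00 BST (1572044400).
first = 1572044400
df = pd.DataFrame({"time": list(range(first, first + 73 * 3600, 3600))})
df["local_time"] = to_london_time(df["time"])

# Number of hourly rows in each local calendar day
rows_per_day = df.groupby(df["local_time"].dt.date).size()
print(rows_per_day)

# Rows belonging to one local day
oct_27 = df[df["local_time"].dt.date == datetime.date(2019, 10, 27)]
print(oct_27["local_time"].iloc[0], len(oct_27))
```

In this sketch 27 October has 25 rows, because that local day really is 25 hours long. If equal-sized days matter more than starting at local midnight, grouping by the UTC date instead may be an option.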

I've looked at arrow and pytz but can't seem to get them to work in the context of the dataframe. I've been working on the assumption that I need to test and deal with the data when converting it from UNIX time, but I just can't get it right. Even trying to pass the data through arrow.get, e.g.

 d1 = arrow.get(output_df['time'])
 print(d1)

leaves me with the error:

    "Can't parse single argument type of '{}'".format(type(arg))
TypeError: Can't parse single argument type of ''

So I'm guessing that it doesn't want to work as part of a dataframe? Thank you in advance.
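The error message suggests the parser expects a single value, not a whole column. As a hedged illustration of converting one value at a time, here is a sketch using the standard library instead of arrow (so it makes no claims about arrow's API). `epoch_to_london` is a hypothetical helper and the three timestamps are made up to straddle the clock change.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # standard library from Python 3.9

LONDON = ZoneInfo("Europe/London")  # assumption: UK local time is wanted


def epoch_to_london(ts):
    """Convert one UNIX timestamp (seconds) to an aware Europe/London datetime."""
    return datetime.fromtimestamp(int(ts), tz=LONDON)


# Three consecutive hours around the change on 27 October 2019
times = [1572130800, 1572134400, 1572138000]
labels = [epoch_to_london(ts).strftime("%Y-%m-%d %H:%M %Z") for ts in times]
print(labels)
# ['2019-10-27 00:00 BST', '2019-10-27 01:00 BST', '2019-10-27 01:00 GMT']
```

A scalar function like this could be applied element-wise to a column of UNIX times, for example with `Series.map`. Note that 01:00 appears twice, once in BST and once in GMT.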

Upvotes: 2

Views: 320

Answers (0)
