Reputation: 1210
I am creating a pandas dataframe from historical weather data downloaded from weather underground.
import json
import requests
import pandas as pd
import numpy as np
import datetime
from dateutil.parser import parse
address = "http://api.wunderground.com/api/7036740167876b59/history_20060405/q/CA/San_Francisco.json"
r = requests.get(address)
wu_data = r.json()
Because I do not need all the data I only use the list of observations. This list contains two elements - date and utcdate - that are actually dictionaries.
df = pd.DataFrame.from_dict(wu_data["history"]["observations"])
I would like to index the dataframe I have created with the parsed date from the 'pretty' key within the dictionary. I can access this value by using the array index, but I can't figure out how to do this directly without a loop. For example, for the 23th element I can write
pretty_date = df["date"].values[23]["pretty"]
print pretty_date
time = parse(pretty_date)
print time
And I get
11:56 PM PDT on April 05, 2006
2006-04-05 23:56:00
This is what I am doing at the moment
g = lambda x: parse(x["pretty"])
df_dates = pd.DataFrame.from_dict(df["date"])
df.index = df_date["date"].apply(g)
df is now reindexed. At this point I can remove the columns I do not need.
Is there a more direct way to do this?
Please notice that sometimes there are multiple observations for the same date, but I deal with data cleaning, duplicates, etc. in a different part of the code.
Upvotes: 0
Views: 1146
Reputation: 20553
Since the dtype held in pretty
is just object, you can simply grab them to a list and get indexed. Not sure if this is what you want:
# by the way, `r.json` should be without ()`
wu_data = r.json
df = pd.DataFrame.from_dict(wu_data["history"]["observations"])
# just index using list comprehension, getting "pretty" inside df["date"] object.
df.index = [parse(df["date"][n]["pretty"]) for n in range(len(df))]
df.index
<class 'pandas.tseries.index.DatetimeIndex'>
[2006-04-05 00:56:00, ..., 2006-04-05 23:56:00]
Length: 24, Freq: None, Timezone: None
Hope this helps.
Upvotes: 1