Reputation: 79
I am trying to develop an application to retrieve stock prices (in JSON) and then do some analysis on them. My problem is with getting the JSON response into a pandas DataFrame where I can work. Here is my code:
'''
References
http://stackoverflow.com/questions/6862770/python-3-let-json-object- accept-bytes-or-let-urlopen-output-strings
'''
import json
import pandas as pd
from urllib.request import urlopen
#set API call
url = "https://www.quandl.com/api/v3/datasets/WIKI/AAPL.json?start_date=2017-01-01&end_date=2017-01-31"
#make call and receive response
response = urlopen(url).read().decode('utf8')
dataresponse = json.loads(response)
#check incoming
#print(dataresponse)
df = pd.read_json(dataresponse)
print(df)
The application errors at df = pd.read_json...
with error TypeError: Expected String or Unicode.
So I reckon this is the first hurdle.
The second is getting where I need to. The JSON response contains only two arrays I am interested in, column_names
and data
. How do I extract only these two and put into a pandas DataFrame?
Upvotes: 1
Views: 2377
Reputation: 16251
To answer your first question, pd.read_json
takes a JSON string directly, so you should be doing this:
pd.read_json(response)
But instead, considering how the data is structured, it's best to first convert the JSON string to a dictionary containing the data:
d = json.loads(response)
Then simply build the dataframe from d['dataset']['data']
and d['dataset']['column_names']
:
pd.DataFrame(data=d['dataset']['data'], columns=d['dataset']['column_names'])
Upvotes: 5