Alexander Thomsen
Alexander Thomsen

Reputation: 469

Convert nested JSON to pandas DataFrame

I have a nested JSON, where I want to extact part of it and make it into a pandas DataFrame. It just ain't working whatever stackoverflow post I look through!

I tried different ways using already existing posts but can't get it working

JSON from "result"

 b'{"coin":{"id":363,"name":"Bitcoin","code":"BTC"},"dataType":"marketCap","baseCurrency":"USD","data":[{"date":"2018-01-12","marketCap":"232547809668.32000000"},{"date":"2018-01-13","marketCap":"241311607656.32000000"}

CODE:

http = urllib3.PoolManager()
url = 'https://www.cryptocurrencychart.com/api/coin/history/363/2018-01-12/2019-01-12/marketCap/USD'
headers = urllib3.util.make_headers(basic_auth='xxx:xxx')
r = http.request('GET', url , headers = headers) 
result = r.data

df = json_normalize(result['data'])
df.set_index('date', inplace = True)

Result is type: bytes

ERROR:

TypeError: byte indices must be integers or slices, not str

Expected DataFrame

            BTCmarketCap
2019-01-01  xxxxxxx 
2019-01-02  xxxxxx

Upvotes: 4

Views: 13355

Answers (1)

cs95
cs95

Reputation: 402513

To unpack the dictionary, use json_normalize with a record_path=... argument.

import pandas.io.json as pd_json

data = pd_json.loads(result)
pd_json.json_normalize(data, record_path='data')

         date              marketCap
0  2018-01-12  232547809668.32000000
1  2018-01-13  241311607656.32000000

If you want the other values as well, pass a meta=.... argument:

df = pd_json.json_normalize(data, 
                            record_path='data', 
                            meta=['coin', 'dataType', 'baseCurrency'])
df

         date              marketCap     ...        dataType baseCurrency
0  2018-01-12  232547809668.32000000     ...       marketCap          USD
1  2018-01-13  241311607656.32000000     ...       marketCap          USD

 df.columns
# Index(['date', 'marketCap', 'coin', 'dataType', 'baseCurrency'], dtype='object')

Upvotes: 5

Related Questions