Reputation: 159
I am trying to convert a json returned from the API call into pandas dataframe. Ideally I would like to extract only 'Type','Name' and 'SUPPLY'.
I have tried multiple things, such as flatten()
, json_normalize()
and so on, but couldn't make it to work.
def get_cryptocompare_data():
url = "https://min-api.cryptocompare.com/data/top/mktcapfull?limit=15&tsym=USD"
data = requests.get(url)
d = data.json()
I would like to convert it to pandas dataframe, with columns for 'Type','Name','SUPPLY'.
Upvotes: 0
Views: 1048
Reputation: 28729
Jmespath could help here with nested paths - basic summary is if u encounter a list, represent it with a bracket([]
), if it is a key, access it wih dot notation (.
) :
import requests
url = "https://min-api.cryptocompare.com/data/top/mktcapfull?limit=15&tsym=USD"
data = requests.get(url).json()
#example : access the name key :
#path : Data -> list -> CoinInfo(dict) ->Name(dict)
#representation : Data[].CoinInfo.Name
#in words : Data, then list, then CoinInfo key, then Name key
import jmespath
expression = jmespath.compile('''{name:Data[].CoinInfo.Name,
type:Data[].CoinInfo.Type,
supply:Data[].RAW.USD.SUPPLY}''')
res = expression.search(data)
Read in data
pd.DataFrame(res)
name type supply
0 BTC 1 1.833352e+07
1 ETH 1 1.105811e+08
2 XRP 1 9.999185e+10
3 GAPS 1 2.000000e+09
4 CRO 1 1.000000e+11
5 USDT 1 4.642367e+09
6 BCH 1 1.838416e+07
7 PLF 1 1.000000e+10
8 CTAG 1 4.000000e+09
9 LINK 1 1.000000e+09
Upvotes: 1
Reputation: 1680
The goal here is to flatten data.json()
which is a dict
returned by requests
.
Flattening dictionaries is a reasonable complex topic because of the various types of values that can be found within. But here, the data feed is known.
Therefore, a simply flattening function can be found from various python recipes and functionally applied to the data obtained from the data feed.
import json
import itertools
import requests
import json
import pandas as pd
def get_cryptocompare_data():
url = "https://min-api.cryptocompare.com/data/top/mktcapfull?limit=15&tsym=USD"
response = requests.get(url)
d = response.json()
return d
# https://codereview.stackexchange.com/questions/21033/flatten-dictionary-in-python-functional-style
def flatten_dict(d):
def items():
for key, value in d.items():
if isinstance(value, dict):
for subkey, subvalue in flatten_dict(value).items():
yield key + "." + subkey, subvalue
else:
yield key, value
return dict(items())
d = get_cryptocompare_data()
data = d['Data']
data = list(map(flatten_dict, data))
df = pd.DataFrame(data)
print(df[['CoinInfo.Name', 'CoinInfo.Type', 'RAW.USD.TYPE', 'RAW.USD.SUPPLY', 'DISPLAY.USD.SUPPLY']])
Will provide the following name, type and supply columns
CoinInfo.Name CoinInfo.Type RAW.USD.TYPE RAW.USD.SUPPLY DISPLAY.USD.SUPPLY
0 BTC 1 5 1.764898e+07 Ƀ 17,648,975.0
1 XRP 1 5 9.999185e+10 XRP 99,991,850,794.0
2 ETH 1 5 1.056861e+08 Ξ 105,686,063.1
3 EOS 1 5 1.041886e+09 EOS 1,041,886,454.0
4 BCH 1 5 1.764895e+07 BCH 17,648,946.0
5 LTC 1 5 6.136893e+07 Ł 61,368,933.6
6 BNB 1 5 1.883456e+08 BNB 188,345,602.2
7 USDT 1 5 2.468906e+09 ₮ 2,468,905,774.0
8 XLM 1 5 1.932851e+10 XLM 19,328,512,453.0
9 ADA 1 5 2.592707e+10 ADA 25,927,070,538.0
Upvotes: 0
Reputation: 431
I prefer the old way (for
loop) when dealing with heavily nested JSON like this because it is simple to understand.
import pandas as pd
records = []
for item in d["Data"]:
records.append(
{
"Type": item["CoinInfo"]["Type"],
"Name": item["CoinInfo"]["Name"],
"SUPPLY": item["RAW"]["USD"]["SUPPLY"],
}
)
df = pd.DataFrame.from_records(records)
df.head()
I don't set the Type
as the index because i'm not sure why you want Type
as the index since it only has 1 value. If you need to set Type
as the index, just add df.set_index("Type", inplace=True)
Upvotes: 0