Loop over json data and merge df data

Question

Trying to loop over JSON data and create a DataFrame at the end containing all the JSON data. I'm stuck on the last part of merging/joining the current loop data DF to the DF that will contain all the data. I know I'm getting the data correctly from the JSON.

Code

tickers_list = ['BTC', 'ETH']
#setup 
tsym = "USD"
s=requests.session()
url = "https://min-api.cryptocompare.com/data/v2/histoday?"
api_key = "apikeytest12345"
cols = ['time', 'close', 'high', 'low']

df_cc = pd.DataFrame()

#Request for historical OHLC data for tickers from Cryptocompare
for i in range(len(tickers_list)):
    res_cc = s.get(url+"fsym="+tickers_list[i]+"&tsym="+tsym +"&limit=10&api_key="+api_key).content
    data = json.loads(res_cc)
    df_cc_t= pd.json_normalize(data['Data'], record_path='Data')[cols]
    df_cc_t = df_cc_t.astype({'time': 'datetime64[s]'}).set_index('time')
    df_cc_t = df_cc_t.rename(columns={"close": tickers_list[i]+"_close", "high": tickers_list[i]+"_high", "low": tickers_list[i]+"_low"})
    df_cc = df_cc.set_index(df_cc_t)
    df_cc = df_cc.join(df_cc_t)

getting an error: TypeError: The parameter "keys" may be a column key, one-dimensional array, or a list containing only valid column keys and one-dimensional arrays.. Received column of type

df_cc expected output (random data)

            BTC_close BTC_high BTC_low ETH_close ....
2021-09-04.  49000.   49100.   48900.  3800
2021-09-05.  49500.   49600.   48900.  3900

not_speshal · Accepted Answer

Your error is occurring because you're trying to set an entire dataframe as an index using df_cc = df_cc.set_index(df_cc_t).

Instead, save all the dataframes to a list and use pd.concat after exiting the loop. Some other changes done to your code:

use enumerate to loop through your list
use f-string to format url
use add_prefix instead of rename
use pd.to_datetime instead of astype

list_of_dfs = list()
for i, ticker in enumerate(tickers_list):
    res_cc = s.get(f"{url}fsym={ticker}&tsym={tsym}&limit=10&api_key={api_key}").content
    data = json.loads(res_cc)
    df_cc_t= pd.json_normalize(data['Data'], record_path='Data')[cols]
    df_cc_t["time"] = pd.to_datetime(df_cc_t["time"], unit="s")
    df_cc_t = df_cc_t.set_index("time").add_prefix(f"{ticker}_")
    list_of_dfs.append(df_cc_t)

df_cc = pd.concat(list_of_dfs, axis=1)

>> df_cc
            BTC_close  BTC_high   BTC_low  ETH_close  ETH_high  ETH_low
time                                                                   
2021-08-27   49088.10  49166.31  46376.81    3275.87   3282.69  3057.59
2021-08-28   48918.90  49306.47  48386.90    3246.78   3287.97  3212.90
2021-08-29   48794.26  49653.49  47841.49    3225.30   3287.20  3154.86
2021-08-30   46993.31  48878.11  46872.47    3228.46   3348.62  3146.73
2021-08-31   47159.26  48252.74  46713.72    3435.17   3480.11  3190.71
2021-09-01   48840.01  49062.05  46544.74    3829.18   3841.79  3385.05
2021-09-02   49280.77  50374.89  48632.70    3787.70   3837.51  3724.10
2021-09-03   50019.51  51056.15  48362.56    3939.79   4026.86  3712.13
2021-09-04   49935.29  50549.35  49501.12    3887.01   3969.89  3835.04
2021-09-05   51784.16  51890.18  49499.28    3952.33   3982.63  3835.93
2021-09-06   51627.75  52209.97  51035.68    3934.38   3971.74  3872.94

Loop over json data and merge df data

Answers (1)

Related Questions