Alexander Thomsen

Reputation: 469

Can't figure out why I get an error when trying to get a JSON value into a DataFrame

I've been looking online but can't figure out why I'm getting this error, since the data is present in the JSON.

I'm trying to extract the "pull_request_contributors" value from the JSON and put it into a DataFrame.

I get the error:

KeyError: "Try running with errors='ignore' as key 'pull_request_contributors' is not always present"

Code

cg = CoinGeckoAPI()

ts = '01-01-2017'
cs = 'bitcoin'

# get data
data = cg.get_coin_history_by_id(cs, ts)

#pull_request_contributors
df_pr = pd_json.json_normalize(data, 
                            record_path='developer_data', 
                            meta=['pull_request_contributors']).set_index(ts)

JSON

{'community_data': {'facebook_likes': 40055,
  'reddit_accounts_active_48h': '4657.4',
  'reddit_average_comments_48h': 186.5,
  'reddit_average_posts_48h': 3.75,
  'reddit_subscribers': 1014816,
  'twitter_followers': 64099},
 'developer_data': {'closed_issues': 3845,
  'commit_count_4_weeks': 245,
  'forks': 22024,
  'pull_request_contributors': 564,
  'pull_requests_merged': 6163,
  'stars': 36987,
  'subscribers': 3521,
  'total_issues': 4478}...

Expectation

date        bitcoin 
01-01-2017  564

Upvotes: 0

Views: 1448

Answers (1)

Vladimir Atanasov

Reputation: 189

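pandas raises this KeyError when it cannot find a meta key in a record. Here meta=['pull_request_contributors'] is looked up at the top level of the response, but the field is nested under developer_data, so pandas cannot build the DataFrame. Run
df_pr = pd_json.json_normalize(data, record_path='developer_data', meta=['pull_request_contributors'], errors='ignore').set_index(ts) to ignore missing fields; pandas then fills them with NaN instead of raising.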
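To illustrate how json_normalize resolves meta keys, here is a minimal sketch. It assumes pandas 1.0 or later (where json_normalize is available as pd.json_normalize), and the 'records' list is a made-up field, added only so that record_path has a list to iterate over; it is not part of the CoinGecko response shown in the question.

```python
import pandas as pd

# Hypothetical response shape: 'records' is an invented list of rows,
# present only so that record_path points at a list.
data = {
    'records': [{'market': 'a'}, {'market': 'b'}],
    'developer_data': {'forks': 22024, 'pull_request_contributors': 564},
}

# A bare key is looked up at the top level of the record, so this raises KeyError.
try:
    pd.json_normalize(data, record_path='records',
                      meta=['pull_request_contributors'])
    raised = False
except KeyError:
    raised = True

# errors='ignore' suppresses the error but fills the column with NaN.
ignored = pd.json_normalize(data, record_path='records',
                            meta=['pull_request_contributors'],
                            errors='ignore')

# A nested path (a list of keys) reaches the actual value.
nested = pd.json_normalize(data, record_path='records',
                           meta=[['developer_data', 'pull_request_contributors']])
```

Note that errors='ignore' only hides the lookup failure: the column comes back as NaN, so it does not by itself produce the value 564.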

EDIT

json_normalize creates a table with all fields as columns and their values as rows. For what you want to achieve, I wouldn't use json_normalize, since you already know which field you want to read. Here's how I would do it:

ts = '01-01-2017'
cs = 'bitcoin'

# build the one-row DataFrame directly from the nested field
df = pd.DataFrame(data=[{'date': ts,
                         cs: data['developer_data']['pull_request_contributors']}]).set_index('date')

This way we simply construct the DataFrame, without first normalizing the response.
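As a self-contained version of the direct construction, the sketch below uses a trimmed copy of the response from the question. The helper name contributors_frame is hypothetical, and using .get() to tolerate a missing field is an assumption about what you would want in that case.

```python
import pandas as pd

# Trimmed copy of the response shown in the question.
data = {'developer_data': {'closed_issues': 3845,
                           'forks': 22024,
                           'pull_request_contributors': 564}}

def contributors_frame(data, ts, cs):
    """Hypothetical helper: one-row frame indexed by date, one column per coin."""
    # .get() returns None instead of raising if the field is absent
    value = data.get('developer_data', {}).get('pull_request_contributors')
    return pd.DataFrame([{'date': ts, cs: value}]).set_index('date')

df = contributors_frame(data, '01-01-2017', 'bitcoin')
print(df)
```

The result has 'date' as the index and a single 'bitcoin' column, matching the layout in the question's expectation.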

I don't know what CoinGeckoAPI returns. If the response is a string rather than a dict, you can decode it first with

import json

data = json.loads(json_string)
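If you are unsure which type you will get, a small guard can handle both cases. This is only a sketch: as_dict is a hypothetical helper, and the sample string is hand-written, not real API output.

```python
import json

def as_dict(response):
    """Hypothetical helper: decode JSON text, pass dicts through unchanged."""
    if isinstance(response, (str, bytes)):
        return json.loads(response)
    return response

# Hand-written sample standing in for a string response.
json_string = '{"developer_data": {"pull_request_contributors": 564}}'
data = as_dict(json_string)
```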

Hope this helps

Upvotes: 1
