Tayyab Vohra
Tayyab Vohra

Reputation: 1662

How to convert nested json into python dataframe

I want to convert my nested json format into pandas dataframe i have tried but my data is something looking like this which is not correct

enter image description here

I have tried to fetch the json and save inside the innings dictionary and tried to convert into the pandas which is not working in the proper format

innings is the dictionary which i am trying to convert into pandas dataframe but it is not converting in the proper format

this is my json something like this

{
  '1164223': [
    {
      'ball_limit': '300',
      'balls': '300',
      'batted': '1',
      'batting_team_id': '2591',
      'bowling_team_id': '1832',
      'bpo': '6',
      'byes': '1',
      'event': '0',
      'event_name': None,
      'extras': '11',
      'innings_number': '1',
      'innings_numth': '1st',
      'lead': '308',
      'legbyes': '4',
      'live_current': '0',
      'live_current_name': None,
      'minutes': None,
      'noballs': '0',
      'old_penalty_or_bonus': '0',
      'over_limit': '50.0',
      'over_limit_run_rate': '6.16',
      'over_split_limit': '0.0',
      'overs': '50.0',
      'overs_docked': '0',
      'penalties': '0',
      'penalties_field_end': '0',
      'penalties_field_start': '0',
      'run_rate': '6.16',
      'runs': '308',
      'target': '0',
      'wickets': '6',
      'wides': '6'
    },
    {
      'ball_limit': '300',
      'balls': '294',
      'batted': '1',
      'batting_team_id': '1832',
      'bowling_team_id': '2591',
      'bpo': '6',
      'byes': '0',
      'event': '0',
      'event_name': None,
      'extras': '10',
      'innings_number': '2',
      'innings_numth': '1st',
      'lead': '3',
      'legbyes': '1',
      'live_current': '1',
      'live_current_name': 'current innings',
      'minutes': None,
      'noballs': '1',
      'old_penalty_or_bonus': '0',
      'over_limit': '50.0',
      'over_limit_run_rate': '6.22',
      'over_split_limit': '0.0',
      'overs': '49.0',
      'overs_docked': '0',
      'penalties': '0',
      'penalties_field_end': '0',
      'penalties_field_start': '0',
      'run_rate': '6.34',
      'runs': '311',
      'target': '309',
      'wickets': '6',
      'wides': '8'
    }
  ],
  '1165045': [
    {
      'ball_limit': '300',
      'balls': '271',
      'batted': '1',
      'batting_team_id': '1003',
      'bowling_team_id': '2989',
      'bpo': '6',
      'byes': '0',
      'event': '1',
      'event_name': 'all out',
      'extras': '10',
      'innings_number': '1',
      'innings_numth': '1st',
      'lead': '169',
      'legbyes': '4',
      'live_current': '0',
      'live_current_name': None,
      'minutes': None,
      'noballs': '1',
      'old_penalty_or_bonus': '0',
      'over_limit': '50.0',
      'over_limit_run_rate': '3.38',
      'over_split_limit': '0.0',
      'overs': '45.1',
      'overs_docked': '0',
      'penalties': '0',
      'penalties_field_end': '0',
      'penalties_field_start': '0',
      'run_rate': '3.74',
      'runs': '169',
      'target': '0',
      'wickets': '10',
      'wides': '5'
    },
    {
      'ball_limit': '300',
      'balls': '239',
      'batted': '1',
      'batting_team_id': '2989',
      'bowling_team_id': '1003',
      'bpo': '6',
      'byes': '0',
      'event': '3',
      'event_name': 'target reached',
      'extras': '12',
      'innings_number': '2',
      'innings_numth': '1st',
      'lead': '1',
      'legbyes': '6',
      'live_current': '1',
      'live_current_name': 'current innings',
      'minutes': None,
      'noballs': '0',
      'old_penalty_or_bonus': '0',
      'over_limit': '50.0',
      'over_limit_run_rate': '3.40',
      'over_split_limit': '0.0',
      'overs': '39.5',
      'overs_docked': '0',
      'penalties': '0',
      'penalties_field_end': '0',
      'penalties_field_start': '0',
      'run_rate': '4.26',
      'runs': '170',
      'target': '170',
      'wickets': '3',
      'wides': '6'
    }
  ]
}

Upvotes: 0

Views: 311

Answers (1)

jezrael
jezrael

Reputation: 862511

Use concat with dictionary comprehension:

df = pd.concat({k: pd.DataFrame(v) for k, v in j.items()})
print (df)
          ball_limit balls batted batting_team_id bowling_team_id bpo byes  \
1164223 0        300   300      1            2591            1832   6    1   
        1        300   294      1            1832            2591   6    0   
1165045 0        300   271      1            1003            2989   6    0   
        1        300   239      1            2989            1003   6    0   

          event      event_name extras  ...  overs overs_docked penalties  \
1164223 0     0            None     11  ...   50.0            0         0   
        1     0            None     10  ...   49.0            0         0   
1165045 0     1         all out     10  ...   45.1            0         0   
        1     3  target reached     12  ...   39.5            0         0   

          penalties_field_end penalties_field_start run_rate runs target  \
1164223 0                   0                     0     6.16  308      0   
        1                   0                     0     6.34  311    309   
1165045 0                   0                     0     3.74  169      0   
        1                   0                     0     4.26  170    170   

          wickets wides  
1164223 0       6     6  
        1       6     8  
1165045 0      10     5  
        1       3     6  

[4 rows x 32 columns]

Another solution is loop in dict comprehension and add key of outer dicts for list of dicts, last pass to DataFrame contructor:

df = pd.DataFrame([dict(x, **{'_id':k}) for k, v in j.items() for x in v])
print (df)
       _id ball_limit balls batted batting_team_id bowling_team_id bpo byes  \
0  1164223        300   300      1            2591            1832   6    1   
1  1164223        300   294      1            1832            2591   6    0   
2  1165045        300   271      1            1003            2989   6    0   
3  1165045        300   239      1            2989            1003   6    0   

  event      event_name  ...  overs overs_docked penalties  \
0     0            None  ...   50.0            0         0   
1     0            None  ...   49.0            0         0   
2     1         all out  ...   45.1            0         0   
3     3  target reached  ...   39.5            0         0   

  penalties_field_end penalties_field_start run_rate runs target wickets wides  
0                   0                     0     6.16  308      0       6     6  
1                   0                     0     6.34  311    309       6     8  
2                   0                     0     3.74  169      0      10     5  
3                   0                     0     4.26  170    170       3     6  

[4 rows x 33 columns]

Upvotes: 1

Related Questions