Reputation: 35
I'm trying to make a DataFrame from a JSON file, without success. I'm not sure why, as this is my first time dealing with JSON files, but from what I found while looking for a solution, the data seems to be heavily nested.
Here's the JSON: https://stats.nba.com/stats/leagueLeaders?ActiveFlag=No&LeagueID=00&PerMode=Totals&Scope=S&Season=All+Time&SeasonType=Regular+Season&StatCategory=AST
I've tried creating the DataFrame with pandas' .read_json() using different orient values, with open() plus json.load(), and with requests. The requests approach did give me a DataFrame, but a badly structured one.
AST_LDR = pd.read_json('C:\\Users\\user\\Desktop\\python\\Kobe Bryant\\AssistLeaders.json')
#Error
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-20-55a99c3e3802> in <module>
----> 1 AST_LDR = pd.read_json('C:\\Users\\user\\Desktop\\python\\Kobe Bryant\\AssistLeaders.json')
2
3
~\Anaconda3\lib\site-packages\pandas\io\json\json.py in read_json(path_or_buf, orient, typ, dtype, convert_axes, convert_dates, keep_default_dates, numpy, precise_float, date_unit, encoding, lines, chunksize, compression)
425 return json_reader
426
--> 427 result = json_reader.read()
428 if should_close:
429 try:
~\Anaconda3\lib\site-packages\pandas\io\json\json.py in read(self)
535 )
536 else:
--> 537 obj = self._get_object_parser(self.data)
538 self.close()
539 return obj
~\Anaconda3\lib\site-packages\pandas\io\json\json.py in _get_object_parser(self, json)
554 obj = None
555 if typ == 'frame':
--> 556 obj = FrameParser(json, **kwargs).parse()
557
558 if typ == 'series' or obj is None:
~\Anaconda3\lib\site-packages\pandas\io\json\json.py in parse(self)
650
651 else:
--> 652 self._parse_no_numpy()
653
654 if self.obj is None:
~\Anaconda3\lib\site-packages\pandas\io\json\json.py in _parse_no_numpy(self)
869 if orient == "columns":
870 self.obj = DataFrame(
--> 871 loads(json, precise_float=self.precise_float),
dtype=None)
872 elif orient == "split":
873 decoded = {str(k): v for k, v in compat.iteritems(
ValueError: Expected object or value
#------------
import json
with open('C:\\Users\\user\\Desktop\\python\\Kobe Bryant\\AssistLeaders.json') as f:
    data = json.load(f)
AST_LDR = pd.DataFrame(data)
#Error
---------------------------------------------------------------------------
JSONDecodeError Traceback (most recent call last)
<ipython-input-19-f61dd4edbb9e> in <module>
1 import json
2 with open('C:\\Users\\user\\Desktop\\python\\Kobe Bryant\\AssistLeaders.json') as f:
----> 3 data = json.load(f)
4 AST_LDR = pd.DataFrame(data)
~\Anaconda3\lib\json\__init__.py in load(fp, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
294 cls=cls, object_hook=object_hook,
295 parse_float=parse_float, parse_int=parse_int,
--> 296 parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
297
298
~\Anaconda3\lib\json\__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
346 parse_int is None and parse_float is None and
347 parse_constant is None and object_pairs_hook is None and not kw):
--> 348 return _default_decoder.decode(s)
349 if cls is None:
350 cls = JSONDecoder
~\Anaconda3\lib\json\decoder.py in decode(self, s, _w)
335
336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
338 end = _w(s, end).end()
339 if end != len(s):
~\Anaconda3\lib\json\decoder.py in raw_decode(self, s, idx)
353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:
--> 355 raise JSONDecodeError("Expecting value", s, err.value) from None
356 return obj, end
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Upvotes: 1
Views: 517
Reputation: 93151
JSON is a very flexible format. pd.read_json accepts only a few layouts, and more often than not the actual data does not fit any of them. You are better off treating the response as a dictionary, extracting the data you need, and constructing your DataFrame accordingly. Here the rows are in data['resultSet']['rowSet'] and the column names are in data['resultSet']['headers']:
import pandas as pd
import requests

url = 'https://stats.nba.com/stats/leagueLeaders?ActiveFlag=No&LeagueID=00&PerMode=Totals&Scope=S&Season=All+Time&SeasonType=Regular+Season&StatCategory=AST'
data = requests.get(url).json()
df = pd.DataFrame(data['resultSet']['rowSet'], columns=data['resultSet']['headers'])
Result:
PLAYER_ID PLAYER_NAME GP MIN FGM FGA FG_PCT FG3M FG3A FG3_PCT FTM FTA FT_PCT OREB DREB REB AST STL BLK TOV PF PTS AST_TOV STL_TOV EFG_PCT TS_PCT GP_RANK MIN_RANK FGM_RANK FGA_RANK FG_PCT_RANK FG3M_RANK FG3A_RANK FG3_PCT_RANK FTM_RANK FTA_RANK FT_PCT_RANK OREB_RANK DREB_RANK REB_RANK AST_RANK STL_RANK BLK_RANK TOV_RANK PF_RANK PTS_RANK AST_TOV_RANK STL_TOV_RANK EFG_PCT1 TS_PCT1
0 304 John Stockton 1504 47766 7039 13658 0.515 845.0 2202.0 0.384 4788 5796 0.826 966.0 3085.0 4051 15806 3265.0 315.0 4244.0 3942 19711 3.724 0.769 0.546 0.608 4 9 69 93 109 153 169 88 40 48 183 407 246 371 1 1 400 2 14 45 5 164 50 18
1 467 Jason Kidd 1391 50116 6219 15557 0.400 1988.0 5701.0 0.349 3103 3954 0.785 1768.0 6957.0 8725 12091 2684.0 450.0 4003.0 2572 17529 3.020 0.670 0.464 0.507 11 5 101 56 1190 10 7 293 149 152 447 138 27 56 2 2 284 5 171 85 30 275 881 888
2 959 Steve Nash 1217 38073 6321 12892 0.490 1685.0 3939.0 0.428 3060 3384 0.904 643.0 2999.0 3642 10335 899.0 102.0 3478.0 1982 17387 2.972 0.258 0.556 0.605 36 46 96 116 267 22 35 13 154 221 3 601 262 436 3 222 861 13 411 87 37 1030 32 22
3 349 Mark Jackson 1296 39117 4793 10731 0.447 734.0 2213.0 0.332 2169 2818 0.770 1281.0 3682.0 4963 10334 1608.0 117.0 3155.0 2230 12489 3.275 0.510 0.481 0.522 23 38 227 207 748 193 165 413 324 322 561 273 176 257 4 32 806 23 289 223 19 571 650 679
4 77142 Magic Johnson 906 33245 6211 11951 0.520 325.0 1074.0 0.303 4960 5850 0.848 1601.0 4958.0 6559 10141 1724.0 374.0 3506.0 2050 17707 2.892 0.492 0.533 0.610 227 98 102 145 91 392 369 520 37 45 91 180 80 138 5 21 345 11 368 80 46 625 98 16
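If you would rather work from a saved copy of the response, as in the question, the same idea applies to a file on disk. The following is only a minimal sketch: the `sample` payload is a hypothetical, trimmed-down stand-in with the same `resultSet` / `headers` / `rowSet` shape (four columns taken from the result above), the helper name `frame_from_result_set` is made up for illustration, and a temporary file stands in for your `AssistLeaders.json`.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical, trimmed-down payload with the same shape as the API response:
# column names under resultSet.headers, one list per player under resultSet.rowSet.
sample = {
    "resultSet": {
        "headers": ["PLAYER_ID", "PLAYER_NAME", "GP", "AST"],
        "rowSet": [
            [304, "John Stockton", 1504, 15806],
            [467, "Jason Kidd", 1391, 12091],
            [959, "Steve Nash", 1217, 10335],
        ],
    }
}


def frame_from_result_set(payload):
    """Build a DataFrame from a dict with resultSet.headers and resultSet.rowSet."""
    result_set = payload["resultSet"]
    return pd.DataFrame(result_set["rowSet"], columns=result_set["headers"])


# Write the payload to a temporary file and read it back with json.load,
# mimicking a saved copy of the response on disk.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "AssistLeaders.json"
    path.write_text(json.dumps(sample), encoding="utf-8")
    with path.open(encoding="utf-8") as f:
        data = json.load(f)

df = frame_from_result_set(data)
print(df)
```

If `json.load` still fails on your own file with "Expecting value: line 1 column 1 (char 0)", the parser found no valid JSON at the very start of the file, so it is worth checking that the saved file is not empty and actually contains the JSON response.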
Upvotes: 2