Reputation: 1
I am learning to work with the requests library and pandas, but I have been struggling to get past the starting point, even with a good number of examples online.
I am trying to extract NBA shot data from the URL below using a GET request, and then turn it into a DataFrame:
import pandas
import requests

def extractData():
    Harden_data_url = "https://stats.nba.com/events/?flag=3&CFID=33&CFPARAMS=2017-18&PlayerID=201935&ContextMeasure=FGA&Season=2017-18&section=player&sct=hex"
    response = requests.get(Harden_data_url)
    data = response.json()
    shots = data['resultSets'][0]['rowSet']
    headers = data['resultSets'][0]['headers']
    df = pandas.DataFrame.from_records(shots, columns=headers)
    return df
However, I get this error, starting on line 2, "response = requests.get(url)":
ValueError: No JSON object could be decoded
I imagine I am missing something basic; any debugging help is appreciated!
Upvotes: 0
Views: 103
Reputation: 1862
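The problem is that you are requesting the wrong URL.
The URL you used returns the HTML page, which only defines the layout of the site. The page loads its data from a different URL, which returns it in JSON format. Because the response body is HTML rather than JSON, response.json() has nothing it can decode, hence the ValueError.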
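One way to confirm this kind of diagnosis is to look at what came back before trying to decode it. The sketch below is only an illustration: describe_body is a hypothetical helper name (not part of requests), and it assumes the server sets a sensible Content-Type header, falling back to inspecting the body when it does not.

```python
import json


def describe_body(content_type, text):
    """Classify a response body as 'json', 'html', or 'unknown'.

    Hypothetical helper for debugging; not part of requests or pandas.
    """
    # The Content-Type header is the most direct signal, when present.
    main_type = content_type.split(";")[0].strip().lower()
    if main_type == "application/json":
        return "json"
    if main_type == "text/html":
        return "html"
    # Otherwise, fall back to trying to parse the body itself.
    try:
        json.loads(text)
        return "json"
    except ValueError:
        # HTML documents start with a tag such as <!DOCTYPE html> or <html>.
        return "html" if text.lstrip().startswith("<") else "unknown"
```

With requests, you could call it as describe_body(response.headers.get("Content-Type", ""), response.text) right after requests.get(...). For the URL in the question, the expectation is that it reports "html".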
The correct URL for the data you are looking for is this:
https://stats.nba.com/stats/shotchartdetail?CFID=33&CFPARAMS=2017-18&ContextMeasure=FGA&DateFrom=&DateTo=&EndPeriod=10&EndRange=28800&GameID=&GameSegment=&GroupQuantity=5&LastNGames=0&LeagueID=00&Location=&Month=0&OnOff=&OpponentTeamID=0&Outcome=&PORound=0&Period=0&PlayerID=201935&PlayerPosition=&RangeType=0&RookieYear=&Season=2017-18&SeasonSegment=&SeasonType=Regular+Season&StartPeriod=1&StartRange=0&TeamID=0&VsConference=&VsDivision=
If you open it in a browser, you will see only the raw JSON data. That is exactly what your code will receive, so response.json() can decode it and the rest of your function should work as written.
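Since that query string is long, it may be easier to maintain as a dictionary. The sketch below copies the parameters from the URL above; BASE_URL, PARAMS, build_url and fetch_shots are names made up for this example. It assumes the endpoint still answers a plain GET with the 'resultSets' structure used in the question, which may no longer be true since the API changes over time.

```python
from urllib.parse import urlencode

BASE_URL = "https://stats.nba.com/stats/shotchartdetail"

# Query parameters copied from the data URL; empty strings are sent as "Name=".
PARAMS = {
    "CFID": "33",
    "CFPARAMS": "2017-18",
    "ContextMeasure": "FGA",
    "DateFrom": "",
    "DateTo": "",
    "EndPeriod": "10",
    "EndRange": "28800",
    "GameID": "",
    "GameSegment": "",
    "GroupQuantity": "5",
    "LastNGames": "0",
    "LeagueID": "00",
    "Location": "",
    "Month": "0",
    "OnOff": "",
    "OpponentTeamID": "0",
    "Outcome": "",
    "PORound": "0",
    "Period": "0",
    "PlayerID": "201935",
    "PlayerPosition": "",
    "RangeType": "0",
    "RookieYear": "",
    "Season": "2017-18",
    "SeasonSegment": "",
    "SeasonType": "Regular Season",
    "StartPeriod": "1",
    "StartRange": "0",
    "TeamID": "0",
    "VsConference": "",
    "VsDivision": "",
}


def build_url(base_url, params):
    """Return the full URL, e.g. to paste into a browser for checking."""
    return base_url + "?" + urlencode(params)


def fetch_shots(params):
    """Request the shot data and return it as a DataFrame (needs network)."""
    # Imported here so build_url works without third-party packages installed.
    import pandas as pd
    import requests

    response = requests.get(BASE_URL, params=params, timeout=30)
    response.raise_for_status()  # fail early on 4xx/5xx instead of at .json()
    result = response.json()["resultSets"][0]
    return pd.DataFrame.from_records(result["rowSet"], columns=result["headers"])
```

Calling build_url(BASE_URL, PARAMS) gives a URL you can check in the browser first, and fetch_shots(PARAMS) does the request. To look up another player or season, change only the PlayerID, Season and CFPARAMS entries.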
This blog post explains the method to find the data URL, and although the API has changed a little since the post was written, the method still works: http://www.gregreda.com/2015/02/15/web-scraping-finding-the-api/
Upvotes: 2