Nick Benelli

Reputation: 111

Python Scrape NBA Tracking Drives Data

I am fairly new to Python. I am trying to scrape NBA Drives data via https://stats.nba.com/players/drives/

I used Chrome DevTools to find the API URL, then used the requests package to fetch the JSON.

Original code:

import requests
headers = {"User-Agent": "Mozilla/5.0..."}
url = "https://stats.nba.com/stats/leaguedashptstats?College=&Conference=&Country=&DateFrom=&DateTo=&Division=&DraftPick=&DraftYear=&GameScope=&Height=&LastNGames=0&LeagueID=00&Location=&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PerMode=PerGame&PlayerExperience=&PlayerOrTeam=Player&PlayerPosition=&PtMeasureType=Drives&Season=2019-20&SeasonSegment=&SeasonType=Regular+Season&StarterBench=&TeamID=0&VsConference=&VsDivision=&Weight="
r = requests.get(url, headers = headers)
d = r.json()

This no longer works, however. The request to the URL below now times out on the NBA server, so I need another way to get this data.

<https://stats.nba.com/stats/leaguedashptstats?College=&Conference=&Country=&DateFrom=&DateTo=&Division=&DraftPick=&DraftYear=&GameScope=&Height=&LastNGames=0&LeagueID=00&Location=&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PerMode=PerGame&PlayerExperience=&PlayerOrTeam=Player&PlayerPosition=&PtMeasureType=Drives&Season=2019-20&SeasonSegment=&SeasonType=Regular+Season&StarterBench=&TeamID=0&VsConference=&VsDivision=&Weight=>

While exploring Chrome DevTools, I found that the desired JSON string is shown in the Response tab of the XHR request under Network. Is there any way to get that into Python? See the image below.

Chrome Devtools: XHR Response JSON string

Upvotes: 0

Views: 471

Answers (1)

furas

Reputation: 142793

I tested the URL with the other headers that DevTools shows for this request, and it seems the Referer header is needed for it to work correctly.

EDIT 2020.08.15:

I had to add two more headers to be able to read the response:

'x-nba-stats-origin': 'stats',
'x-nba-stats-token': 'true',

import requests

headers = {
    'User-Agent': 'Mozilla/5.0',
    #'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0',
    'Referer': 'https://stats.nba.com/players/drives/',
    #'Accept': 'application/json, text/plain, */*',

    'x-nba-stats-origin': 'stats',
    'x-nba-stats-token': 'true',
}

url = 'https://stats.nba.com/stats/leaguedashptstats?College=&Conference=&Country=&DateFrom=&DateTo=&Division=&DraftPick=&DraftYear=&GameScope=&Height=&LastNGames=0&LeagueID=00&Location=&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PerMode=PerGame&PlayerExperience=&PlayerOrTeam=Player&PlayerPosition=&PtMeasureType=Drives&Season=2019-20&SeasonSegment=&SeasonType=Regular+Season&StarterBench=&TeamID=0&VsConference=&VsDivision=&Weight='
r = requests.get(url, headers=headers)
data = r.json()

print(data)
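
The printed dictionary is large and hard to read directly. Below is a minimal sketch for turning it into one dict per player. It assumes the response has a `resultSets` list whose entries carry `headers` (column names) and `rowSet` (rows in the same order); that layout is an assumption and is not shown in the code above. The helper name `result_set_to_rows` and the sample values are made up for illustration, so check the real keys with `data.keys()` first.

```python
def result_set_to_rows(data, index=0):
    """Convert one result set of the parsed JSON into a list of dicts."""
    # Assumption: data["resultSets"][index] holds "headers" (column names)
    # and "rowSet" (a list of rows in the same column order).
    result_set = data["resultSets"][index]
    headers = result_set["headers"]
    return [dict(zip(headers, row)) for row in result_set["rowSet"]]


# Made-up sample with the assumed shape, so the sketch runs without a request.
sample = {
    "resultSets": [
        {
            "headers": ["PLAYER_NAME", "DRIVES"],
            "rowSet": [["Player A", 12.5], ["Player B", 9.0]],
        }
    ]
}

rows = result_set_to_rows(sample)
print(rows[0]["PLAYER_NAME"], rows[0]["DRIVES"])
```

With a real response you would call `result_set_to_rows(data)` on the `data` returned by `r.json()`.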

BTW: the same request with the parameters in a dictionary, which makes it easier to change individual values:

import requests

headers = {
    'User-Agent': 'Mozilla/5.0',
    #'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0',
    'Referer': 'https://stats.nba.com/players/drives/',
    #'Accept': 'application/json, text/plain, */*',

    'x-nba-stats-origin': 'stats',
    'x-nba-stats-token': 'true',
}

url = 'https://stats.nba.com/stats/leaguedashptstats'

params = {
    'College': '',
    'Conference': '',
    'Country': '',
    'DateFrom': '',
    'DateTo': '',
    'Division': '',
    'DraftPick': '',
    'DraftYear': '',
    'GameScope': '',
    'Height': '',
    'LastNGames': '0',
    'LeagueID': '00',
    'Location': '',
    'Month': '0',
    'OpponentTeamID': '0',
    'Outcome': '',
    'PORound': '0',
    'PerMode': 'PerGame',
    'PlayerExperience': '',
    'PlayerOrTeam': 'Player',
    'PlayerPosition': '',
    'PtMeasureType': 'Drives',
    'Season': '2019-20',
    'SeasonSegment': '',
    'SeasonType': 'Regular Season',
    'StarterBench': '',
    'TeamID': '0',
    'VsConference': '',
    'VsDivision': '',
    'Weight': '',
}

r = requests.get(url, headers=headers, params=params)
#print(r.request.url)
data = r.json()

print(data)
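
A small sketch of why the dictionary form is convenient: you can copy the parameters and override only what changes. It uses only the standard library to show the query string that such a dictionary encodes to (a space becomes `+`, as in the original URL). The helper `with_overrides` is hypothetical, only a few of the parameters are repeated here to keep it short, and `2018-19` is just an example value in the same format as `2019-20`.

```python
from urllib.parse import urlencode

# A shortened copy of the parameters, for illustration only.
base_params = {
    "PerMode": "PerGame",
    "PtMeasureType": "Drives",
    "Season": "2019-20",
    "SeasonType": "Regular Season",
}


def with_overrides(params, **overrides):
    """Return a copy of params with some values replaced (hypothetical helper)."""
    new_params = dict(params)
    new_params.update(overrides)
    return new_params


# Same query for another season; the original dictionary stays unchanged.
other_season = with_overrides(base_params, Season="2018-19")

query = urlencode(base_params)
print(query)
print(urlencode(other_season))
```

The resulting dictionary can be passed as `params=other_season` in `requests.get()` in place of `params`.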

Upvotes: 2
