Reputation: 1439
I am using a Python package (py_ball, a Python API wrapper for the NBA's stats site), and I have a question about how to obtain the headers needed when accessing an API.
A tutorial for the package uses the following dict as headers for the requests call:
HEADERS = {
    'Connection': 'keep-alive',
    'Host': 'stats.nba.com',
    'Origin': 'http://stats.nba.com',
    'Upgrade-Insecure-Requests': '1',
    'Referer': 'stats.nba.com',
    'x-nba-stats-origin': 'stats',
    'x-nba-stats-token': 'true',
    'Accept-Language': 'en-US,en;q=0.9',
    'X-NewRelic-ID': 'VQECWF5UChAHUlNTBwgBVw==',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6)'
                  ' AppleWebKit/537.36 (KHTML, like Gecko)'
                  ' Chrome/81.0.4044.129 Safari/537.36',
}
This made me curious about how such headers are found; for example, if I wanted to pull data from a different site, I assume I would need different headers. After some searching, I found an example. Starting from this page:
https://www.nba.com/stats/players/traditional?PerMode=Totals&sort=PTS&dir=-1
After opening Inspect Element and clicking the Network tab, I found the request URL for the data I want:
https://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=&DateTo=&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=Totals&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2022-23&SeasonSegment=&SeasonType=Pre%20Season&ShotClockRange=&StarterBench=&TeamID=0&VsConference=&VsDivision=&Weight=
Now I just need the headers, and I believe I came close. If you click the request of interest in the Network tab and then click Headers, you can see the different groups of headers, including the Request Headers. However, these headers are (1) slightly different from the headers provided by the tutorial and (2) not easy to copy (there was no way to copy them, as far as I could tell).
So my question is: how did the tutorial know to use those specific headers, and where did they find them? Also, is there a way to copy headers in Chrome when using Inspect Element, or to see them in Python via the requests package?
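On the Python side of that question, here is a minimal sketch of how the headers that requests would send can be listed without making a network call. The helper name headers_sent is hypothetical; it relies on requests.Session.prepare_request, which merges the session's default headers with any you pass in.

```python
import requests


def headers_sent(url, extra_headers=None):
    """Return the headers requests would send for a GET, without any network I/O."""
    session = requests.Session()
    request = requests.Request("GET", url, headers=extra_headers or {})
    # prepare_request merges the session defaults (User-Agent, Accept, ...)
    # with the headers supplied for this particular request.
    prepared = session.prepare_request(request)
    return dict(prepared.headers)


# Illustrative call: only one of the tutorial's custom headers is passed here.
for name, value in headers_sent(
    "https://stats.nba.com/stats/leaguedashplayerstats",
    {"x-nba-stats-origin": "stats"},
).items():
    print(f"{name}: {value}")
```

After a real call, the same information should be available on the response as r.request.headers, which can be compared against what Chrome shows under Request Headers.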
For additional context, here is the code I used to successfully obtain the data without the package:
import pandas as pd
import requests
import json
# Headers copied from the tutorial
HEADERS = {
    'Connection': 'keep-alive',
    'Host': 'stats.nba.com',
    'Origin': 'http://stats.nba.com',
    'Upgrade-Insecure-Requests': '1',
    'Referer': 'stats.nba.com',
    'x-nba-stats-origin': 'stats',
    'x-nba-stats-token': 'true',
    'Accept-Language': 'en-US,en;q=0.9',
    'X-NewRelic-ID': 'VQECWF5UChAHUlNTBwgBVw==',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6)'
                  ' AppleWebKit/537.36 (KHTML, like Gecko)'
                  ' Chrome/81.0.4044.129 Safari/537.36',
}
url = 'https://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=&DateTo=&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=Totals&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2022-23&SeasonSegment=&SeasonType=Pre%20Season&ShotClockRange=&StarterBench=&TeamID=0&VsConference=&VsDivision=&Weight='
r = requests.get(url, headers=HEADERS)
# Parse the JSON body once and reuse it for both the rows and the column names
result = json.loads(r.text)['resultSets'][0]
df = pd.DataFrame(result['rowSet'], columns=result['headers'])
I would like to know how to obtain the correct headers myself, because there may not be a tutorial for whatever I try to do in future projects.
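One way to approach this without a tutorial is trial and error: start from every header the browser sent and drop them one at a time while the request still succeeds. The sketch below is only an illustration of that idea, not a claim about how the tutorial's author arrived at the list. Both prune_headers and the still_works callback are hypothetical names; in practice still_works would wrap a requests.get call and check the response status.

```python
def prune_headers(headers, still_works):
    """Greedily drop headers that the server does not appear to require.

    headers:     dict of header name -> value, e.g. copied from the browser
    still_works: callable taking a headers dict and returning True if a
                 request made with those headers still succeeds
    """
    kept = dict(headers)
    for name in list(headers):
        # Try the request without this one header.
        trial = {k: v for k, v in kept.items() if k != name}
        if still_works(trial):
            kept = trial  # the header was not needed, so leave it out
    return kept


# Demo with a fake checker standing in for a real HTTP request.
# The "required" set here is invented purely for illustration.
required = {"Referer", "User-Agent"}
candidate = {"Referer": "a", "User-Agent": "b", "X-NewRelic-ID": "c"}
print(prune_headers(candidate, lambda h: required <= set(h)))
```

With a real checker, keep the number of trial requests small and add a timeout, since some servers hang rather than reject requests that lack expected headers.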
Upvotes: 1
Views: 3949
Reputation: 81
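There are plenty of YouTube tutorials that show how to get request properties from the Network tab of the inspector, like this one: https://www.youtube.com/watch?v=mMjzEI27xDI
A quick curl-to-Python (or any other language) converter I like is https://curlconverter.com/. In Chrome you can right-click the request in the Network tab, choose Copy > Copy as cURL, and paste the result there.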
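If you would rather not paste a request into a website, the header part of that conversion is small enough to sketch by hand. The following is a rough illustration, assuming a bash-style curl command that uses -H or --header flags; headers_from_curl is a hypothetical helper, and it ignores every other curl option.

```python
import shlex


def headers_from_curl(command):
    """Collect the -H/--header values of a curl command into a dict."""
    # A bash-style curl command joins lines with backslash-newline; remove
    # those continuations so shlex sees one logical command line.
    tokens = shlex.split(command.replace("\\\n", " "))
    headers = {}
    i = 0
    while i < len(tokens):
        if tokens[i] in ("-H", "--header") and i + 1 < len(tokens):
            # Split on the first colon only, since values may contain colons.
            name, _, value = tokens[i + 1].partition(":")
            headers[name.strip()] = value.strip()
            i += 2
        else:
            i += 1
    return headers


# Shortened, made-up command in the shape that "Copy as cURL" produces.
sample = """curl 'https://stats.nba.com/stats/leaguedashplayerstats?LeagueID=00' \\
  -H 'Referer: https://www.nba.com/' \\
  -H 'x-nba-stats-origin: stats' \\
  --compressed"""
print(headers_from_curl(sample))
```

The resulting dict can be passed straight to requests.get(url, headers=...).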
Upvotes: 3