Reputation: 461
Check out this gist for details: https://gist.github.com/stockninja/7b9bcbfc8f338da414ae9678ec98016d
The offending script is called main.py, and I included the output of some attempts I made to understand what's failing.
I'm completely at a loss as to what's failing!
Upvotes: 2
Views: 74
Reputation: 180391
Very simple solution, and it has nothing to do with proxies: you need to add a User-Agent header.
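import requests

def sync():
    # Send a browser-like User-Agent; the server rejects requests without one.
    head = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"}
    all_urls = urls()  # urls() comes from the question's script
    for url in all_urls:
        res = requests.get(url, headers=head)
        print(url)
        print(res.json())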
Once you do, every request returns JSON:
In [2]: sync()
http://stats.nba.com/stats/teamgamelog?TeamID=1610612737&Season=2016-15&SeasonType=Regular+Season
{'resource': 'teamgamelog', 'parameters': {'TeamID': 1610612737, 'Season': '2016-15', 'LeagueID': None, 'SeasonType': 'Regular Season'}, 'resultSets': [{'name': 'TeamGameLog', 'rowSet': [], 'headers': ['Team_ID', 'Game_ID', 'GAME_DATE', 'MATCHUP', 'WL', 'MIN', 'FGM', 'FGA', 'FG_PCT', 'FG3M', 'FG3A', 'FG3_PCT', 'FTM', 'FTA', 'FT_PCT', 'OREB', 'DREB', 'REB', 'AST', 'STL', 'BLK', 'TOV', 'PF', 'PTS']}]}
http://stats.nba.com/stats/teamgamelog?TeamID=1610612738&Season=2016-15&SeasonType=Regular+Season
{'resource': 'teamgamelog', 'parameters': {'TeamID': 1610612738, 'Season': '2016-15', 'LeagueID': None, 'SeasonType': 'Regular Season'}, 'resultSets': [{'name': 'TeamGameLog', 'rowSet': [], 'headers': ['Team_ID', 'Game_ID', 'GAME_DATE', 'MATCHUP', 'WL', 'MIN', 'FGM', 'FGA', 'FG_PCT', 'FG3M', 'FG3A', 'FG3_PCT', 'FTM', 'FTA', 'FT_PCT', 'OREB', 'DREB', 'REB', 'AST', 'STL', 'BLK', 'TOV', 'PF', 'PTS']}]}
http://stats.nba.com/stats/teamgamelog?TeamID=1610612751&Season=2016-15&SeasonType=Regular+Season
{'resource': 'teamgamelog', 'parameters': {'TeamID': 1610612751, 'Season': '2016-15', 'LeagueID': None, 'SeasonType': 'Regular Season'}, 'resultSets': [{'name': 'TeamGameLog', 'rowSet': [], 'headers': ['Team_ID', 'Game_ID', 'GAME_DATE', 'MATCHUP', 'WL', 'MIN', 'FGM', 'FGA', 'FG_PCT', 'FG3M', 'FG3A', 'FG3_PCT', 'FTM', 'FTA', 'FT_PCT', 'OREB', 'DREB', 'REB', 'AST', 'STL', 'BLK', 'TOV', 'PF', 'PTS']}]}
http://stats.nba.com/stats/teamgamelog?TeamID=1610612766&Season=2016-15&SeasonType=Regular+Season
{'resource': 'teamgamelog', 'parameters': {'TeamID': 1610612766, 'Season': '2016-15', 'LeagueID': None, 'SeasonType': 'Regular Season'}, 'resultSets': [{'name': 'TeamGameLog', 'rowSet': [], 'headers': ['Team_ID', 'Game_ID', 'GAME_DATE', 'MATCHUP', 'WL', 'MIN', 'FGM', 'FGA', 'FG_PCT', 'FG3M', 'FG3A', 'FG3_PCT', 'FTM', 'FTA', 'FT_PCT', 'OREB', 'DREB', 'REB', 'AST', 'STL', 'BLK', 'TOV', 'PF', 'PTS']}]}
http://stats.nba.com/stats/teamgamelog?TeamID=1610612741&Season=2016-15&SeasonType=Regular+Season
{'resource': 'teamgamelog', 'parameters': {'TeamID': 1610612741, 'Season': '2016-15', 'LeagueID': None, 'SeasonType': 'Regular Season'}, 'resultSets': [{'name': 'TeamGameLog', 'rowSet': [], 'headers': ['Team_ID', 'Game_ID', 'GAME_DATE', 'MATCHUP', 'WL', 'MIN', 'FGM', 'FGA', 'FG_PCT', 'FG3M', 'FG3A', 'FG3_PCT', 'FTM', 'FTA', 'FT_PCT', 'OREB', 'DREB', 'REB', 'AST', 'STL', 'BLK', 'TOV', 'PF', 'PTS']}]}
http://stats.nba.com/stats/teamgamelog?TeamID=1610612739&Season=2016-15&SeasonType=Regular+Season
{'resource': 'teamgamelog', 'parameters': {'TeamID': 1610612739, 'Season': '2016-15', 'LeagueID': None, 'SeasonType': 'Regular Season'}, 'resultSets': [{'name': 'TeamGameLog', 'rowSet': [], 'headers': ['Team_ID', 'Game_ID', 'GAME_DATE', 'MATCHUP', 'WL', 'MIN', 'FGM', 'FGA', 'FG_PCT', 'FG3M', 'FG3A', 'FG3_PCT', 'FTM', 'FTA', 'FT_PCT', 'OREB', 'DREB', 'REB', 'AST', 'STL', 'BLK', 'TOV', 'PF', 'PTS']}]}
And so on...
Without the User-Agent header you get a 400 every time. You might also want to add a sleep between requests and investigate the rate limit/requests per day.
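To illustrate the sleep suggestion, here is a minimal sketch rather than a tested client. The names fetch_all, delay, session and sleep are hypothetical, and the one-second default is a guess, since I don't know the site's actual rate limit. It reuses a single requests.Session that carries the User-Agent and pauses between consecutive requests:

```python
import time

# Same browser-like User-Agent as in sync() above.
HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"}


def fetch_all(all_urls, delay=1.0, session=None, sleep=time.sleep):
    """Fetch each URL in order, pausing `delay` seconds between requests.

    `delay` is an assumed value, not a documented limit of the site.
    """
    if session is None:
        # Imported here so a stub session can be passed in without requests.
        import requests
        session = requests.Session()
    # Every request made through the session now sends the User-Agent.
    session.headers.update(HEADERS)

    results = {}
    for i, url in enumerate(all_urls):
        if i:
            sleep(delay)  # no pause before the first request
        res = session.get(url, timeout=10)
        res.raise_for_status()  # raise on a 400 instead of parsing an error body
        results[url] = res.json()
    return results
```

With the question's script you would call it as fetch_all(urls()) and raise delay if the server starts refusing requests.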
Upvotes: 4