Reputation: 73
As the title states, I am getting a 403 error. The URLs generated are valid; I can print them and then open them in my browser just fine.
I've got a user agent, and it's the exact same one my browser sends when accessing the page I want to scrape, pulled straight from Chrome DevTools. I've tried using sessions instead of a plain request, I've tried urllib, and I've tried a generic requests.get.
Here's the code I'm using that 403s. Same result with requests.get, etc.
import requests

# URL is the page being scraped; the User-Agent is copied from Chrome DevTools.
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36'}
session = requests.Session()
req = session.get(URL, headers=headers)
So yeah, I assume I'm not setting the user agent right, so the site can tell I am scraping. But I'm not sure what I'm missing, or how to find that out.
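For what it's worth, here's how I've been sanity-checking what actually goes out. httpbin.org/headers just echoes the request headers back, so it's only a debugging stand-in, not the site I'm scraping:

import requests

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36'}

# httpbin echoes back the headers it received, so I can compare them
# against the request my browser shows in DevTools.
r = requests.get('https://httpbin.org/headers', headers=headers)
print(r.json()['headers'])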
Upvotes: 1
Views: 4927
Reputation: 886
Add some headers as below (not only User-Agent):
from scrapy import Request

def start_requests(self):
    headers = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
        'Accept-Encoding': 'gzip, deflate, br',
        'Accept-Language': 'en-US,en;q=0.5',
        'Connection': 'keep-alive',
        # Cookie and cache-validator values below are copied from one browser
        # session and will go stale; replace them with your own from DevTools.
        'Cookie': 'AMCV_0D15148954E6C5100A4C98BC%40AdobeOrg=1176715910%7CMCIDTS%7C19271%7CMCMID%7C80534695734291136713728777212980602826%7CMCAAMLH-1665548058%7C7%7CMCAAMB-1665548058%7C6G1ynYcLPuiQxYZrsz_pkqfLG9yMXBpb2zX5dvJdYQJzPXImdj0y%7CMCOPTOUT-1664950458s%7CNONE%7CMCAID%7CNONE%7CMCSYNCSOP%7C411-19272%7CvVersion%7C5.4.0; s_ecid=MCMID%7C80534695734291136713728777212980602826; __cfruid=37ff2049fc4dcffaab8d008026b166001c67dd49-1664418998; AMCVS_0D15148954E6C5100A4C98BC%40AdobeOrg=1; s_cc=true; __cf_bm=NIDFoL5PTkinis50ohQiCs4q7U4SZJ8oTaTW4kHT0SE-1664943258-0-AVwtneMLLP997IAVfltTqK949EmY349o8RJT7pYSp/oF9lChUSNLohrDRIHsiEB5TwTZ9QL7e9nAH+2vmXzhTtE=; PHPSESSID=ddf49facfda7bcb4656eea122199ea0d',
        'If-Modified-Since': 'Tue, 04 May 2021 05:09:49 GMT',
        'If-None-Match': 'W/"12c6a-5c17a16600f6c-gzip"',
        'Sec-Fetch-Dest': 'document',
        'Sec-Fetch-Mode': 'navigate',
        'Sec-Fetch-Site': 'none',
        'Sec-Fetch-User': '?1',
        'TE': 'trailers',
        'Upgrade-Insecure-Requests': '1',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:105.0) Gecko/20100101 Firefox/105.0'
    }
    for url in self.start_urls:
        yield Request(url, headers=headers)
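If you're using plain requests rather than Scrapy, the same idea carries over. A minimal sketch, where https://example.com stands in for your target page; I've left out the Cookie and cache-validator headers, since those are tied to one browser session and should be copied fresh from your own DevTools:

import requests

headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:105.0) Gecko/20100101 Firefox/105.0',
}

# example.com is a placeholder; use the URL that returns the 403.
r = requests.get('https://example.com', headers=headers)
print(r.status_code)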
Upvotes: 1
Reputation: 142651
I got all the headers from DevTools and started removing them one by one, and I found it needs only Accept-Language. It doesn't need User-Agent, and it doesn't need a Session.
import requests

url = 'https://www.g2a.com/lucene/search/filter?&search=The+Elder+Scrolls+V:+Skyrim&currency=nzd&cc=NZD'

# The only header this endpoint actually requires.
headers = {
    'Accept-Language': 'en-US;q=0.7,en;q=0.3',
}

r = requests.get(url, headers=headers)
data = r.json()
print(data['docs'][0]['name'])
Result:
The Elder Scrolls V: Skyrim Special Edition Steam Key GLOBAL
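The removal pass itself can be scripted. A rough sketch of that drop-one-header-at-a-time loop (it's greedy, so it ignores interactions between headers, and the starting dict here is abbreviated rather than the full DevTools copy):

import requests

url = 'https://www.g2a.com/lucene/search/filter?&search=The+Elder+Scrolls+V:+Skyrim&currency=nzd&cc=NZD'

# Start from the full header set copied out of DevTools (abbreviated here).
headers = {
    'Accept-Language': 'en-US;q=0.7,en;q=0.3',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:105.0) Gecko/20100101 Firefox/105.0',
}

# Drop each header in turn; if the request still succeeds without it,
# that header is not required.
for name in list(headers):
    trimmed = {k: v for k, v in headers.items() if k != name}
    status = requests.get(url, headers=trimmed).status_code
    print(name, 'required' if status == 403 else 'not required')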
Upvotes: 3