Reputation: 27
I have just started learning web scraping with Python, using the BeautifulSoup and requests libraries in PyCharm.
import requests
from bs4 import BeautifulSoup
result1 = requests.get("https://www.grainger.com/")
print('result1 is '+ str(result1.status_code))
With this website, the script keeps loading and never prints anything, but if I use google.com
it gives output.
Why don't I get any output for the website above?
Upvotes: 1
Views: 240
Reputation: 195603
To get status 200 from this site, specify a User-Agent HTTP header:
import requests
from bs4 import BeautifulSoup
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:81.0) Gecko/20100101 Firefox/81.0'}
result1 = requests.get("https://www.grainger.com/", headers=headers)
print('result1 is '+ str(result1.status_code))
Prints:
result1 is 200
This works because some sites ignore requests that don't appear to come from a web browser. By default, requests sends a User-Agent of the form python-requests/<version>, so the website can tell that the request is not coming from a browser. Your request most likely hangs and eventually times out because their server is ignoring it.
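To make the explanation above concrete, here is a minimal sketch that prints the User-Agent requests sends by default and wraps the request in a timeout, so a server that ignores the request cannot hang the script indefinitely. The helper names browser_headers and fetch_status and the 10-second timeout are assumptions made for illustration; they are not part of the original answer.

```python
import requests

# Browser-like User-Agent string, copied from the answer above.
BROWSER_UA = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:81.0) Gecko/20100101 Firefox/81.0"


def browser_headers(user_agent=BROWSER_UA):
    """Return requests' default headers with the User-Agent replaced.

    Hypothetical helper, used only for this sketch.
    """
    headers = requests.utils.default_headers()
    headers["User-Agent"] = user_agent
    return headers


def fetch_status(url, timeout=10):
    """Return the HTTP status code, or None if the request fails or times out.

    Hypothetical helper; the 10-second default is an arbitrary choice.
    """
    try:
        response = requests.get(url, headers=browser_headers(), timeout=timeout)
    except requests.exceptions.RequestException:
        # Covers timeouts, connection errors, and other request failures.
        return None
    return response.status_code


# What requests sends when no User-Agent is given: "python-requests/<version>"
print(requests.utils.default_user_agent())
```

Calling fetch_status("https://www.grainger.com/") would then return either a status code or None instead of blocking forever. Whether a given site accepts the request still depends on how that site filters clients.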
Upvotes: 1
Reputation: 58
Hmm... there are a couple of things.
Upvotes: 0