Reputation: 49
So, I learned how web scraping works a few days ago, and I was messing around with it today. I wanted to know how I could test whether a page exists. I looked it up and found "Python check if website exists". I'm using the requests module, and I got this code from the answers:
import requests
request = requests.get('http://www.example.com')
if request.status_code == 200:
    print('Web site exists')
else:
    print('Web site does not exist')
I tried it out, and since example.com exists, it printed "Web site exists". However, I tried something I was sure wouldn't exist, like examplewwwwwww.com and it gave me this error. Why is it doing this and how can I keep it from printing out an error (and instead saying that the website does not exist)?
Upvotes: 2
Views: 8847
Reputation: 412
Just to list my way of doing it, maybe it can be of value for someone:
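import requests

ready = 0
try:
    response = requests.get('https://github.com')
    if response.ok:  # True for any status code below 400
        ready = 1
except requests.exceptions.RequestException:
    # Base class of every exception raised by requests
    print("Website not available...")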
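If the check is meant to be repeated until the site answers, it could be wrapped in a small polling loop. This is only a sketch under that assumption: the function name wait_until_available and the attempts, delay, and timeout parameters are hypothetical and not part of the original answer.

```python
import time

import requests


def wait_until_available(url, attempts=3, delay=1.0, timeout=5):
    """Return True as soon as url answers with a status below 400, else False."""
    for attempt in range(attempts):
        try:
            response = requests.get(url, timeout=timeout)
            if response.ok:
                return True
        except requests.exceptions.RequestException:
            # Covers connection errors, timeouts, invalid URLs, and so on.
            print("Website not available...")
        # Pause before the next attempt, but not after the last one.
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling wait_until_available('https://github.com') would then try up to three times before giving up.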
Upvotes: 2
Reputation: 1604
You have to enclose the requests.get call in a try/except block and handle the various exceptions that might arise, one of which is ConnectionError.
You get this error because receiving a response whose status_code is not 200 and being unable to connect to the requested HTTP address at all are two different things.
Here are the exceptions that you might encounter when making requests with the requests library.
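As a rough illustration of handling these cases separately, the sketch below maps each failure to a short message. The function name check_site, the returned strings, and the 5-second timeout are assumptions made for this example; the exception classes themselves come from requests.exceptions.

```python
import requests


def check_site(url, timeout=5):
    """Classify the outcome of a GET request as a short string."""
    try:
        response = requests.get(url, timeout=timeout)
    except (requests.exceptions.MissingSchema,
            requests.exceptions.InvalidSchema,
            requests.exceptions.InvalidURL):
        # The URL itself is malformed, so no connection was attempted.
        return "invalid URL"
    except requests.exceptions.Timeout:
        # Checked before ConnectionError because ConnectTimeout subclasses both.
        return "timed out"
    except requests.exceptions.ConnectionError:
        # DNS failure, refused connection, and similar network problems.
        return "could not connect"
    except requests.exceptions.RequestException:
        # Anything else raised by requests.
        return "request failed"
    # A response arrived, so only now does the status code mean anything.
    if response.status_code == 200:
        return "exists"
    return "responded with status %d" % response.status_code
```

With this split, a host that cannot be reached reports "could not connect", while a server that answers with, say, 404 reports the status code instead.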
Upvotes: 0
Reputation: 4106
Well, you're getting the error because the URL you want to get is invalid. However, you can easily check for this with a try-except block like this one:
import requests
from requests.exceptions import MissingSchema
try:
    request = requests.get('examplewwwwwww.com')
except MissingSchema:
    print('The provided URL is invalid.')
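MissingSchema is raised here because the URL has no http:// or https:// prefix. One possible way to avoid it is to add a scheme before calling requests.get. The helper below is a minimal sketch; the name ensure_scheme and the default of "http" are assumptions for illustration only.

```python
def ensure_scheme(url, default_scheme="http"):
    """Prepend default_scheme:// unless url already starts with http:// or https://."""
    if url.lower().startswith(("http://", "https://")):
        return url
    return default_scheme + "://" + url


print(ensure_scheme("examplewwwwwww.com"))
print(ensure_scheme("https://github.com"))
```

Once the scheme is present, a host that cannot be resolved raises ConnectionError rather than MissingSchema, so that exception still has to be handled.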
Upvotes: 1
Reputation: 855
You can use try/except like this:
import requests
from requests.exceptions import ConnectionError
try:
    request = requests.get('http://www.example.com')
except ConnectionError:
    print('Web site does not exist')
else:
    print('Web site exists')
Upvotes: 5