Reputation: 982
I'm scraping average salary data for a list of jobs to make infographics. If the job can be found, like "programmer", the request returns status code 200 and the page I end up on is the same URL I requested in the script.
import requests
job_url: str = "https://www.ziprecruiter.com/Salaries/What-Is-the-Average-Programmer-Salary-by-State"
job_response = requests.get(job_url, timeout=10)
print(job_response)
If the lookup fails, as it does below for "Youtuber", I want to display an error message to the user. However, I still get status code 200. When I try this manually, the site redirects me to a page like "https://www.ziprecruiter.com/Salaries/What-Is-the-Average-Youtuber-Salary-by-State?ind=null"
null_url: str = "https://www.ziprecruiter.com/Salaries/What-Is-the-Average-Youtuber-Salary-by-State"
null_response = requests.get(null_url, timeout=10)
How can I detect in code that the request was redirected to an empty page? Do I need to use another library?
Upvotes: 0
Views: 47
Reputation: 12265
You can disable redirection and check the response:
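import requests

null_url = "https://www.ziprecruiter.com/Salaries/What-Is-the-Average-Youtuber-Salary-by-State"
# Do not follow the redirect, so the redirect response itself can be inspected.
null_response = requests.get(null_url, timeout=10, allow_redirects=False)
# Any one of the following checks is enough.
if null_response.status_code == 301:
    print("Not found")
if "Moved Permanently" in null_response.text:
    print("Not found")
# .next is None when the response is not a redirect, so guard before reading .url.
if null_response.next is not None and "ind=null" in null_response.next.url:
    print("Not found")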
Or follow the redirects and inspect the final response:
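null_url = "https://www.ziprecruiter.com/Salaries/What-Is-the-Average-Youtuber-Salary-by-State"
# Redirects are followed by default; .url is the final URL after redirects.
null_response = requests.get(null_url, timeout=10)
if "ind=null" in null_response.url:
    print("Not found")
# .history is empty when no redirect happened, so check it before indexing.
if null_response.history and null_response.history[0].status_code == 301:
    print("Not found")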
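A plain substring test such as `"ind=null" in url` would also match an unrelated parameter like `find=null`. As a minimal sketch, assuming the site keeps signalling a missing job with the `ind=null` query parameter seen in the question, the final URL can be parsed instead. The function name `redirected_to_empty_page` is hypothetical and only for illustration:

```python
from urllib.parse import parse_qs, urlsplit


def redirected_to_empty_page(final_url: str) -> bool:
    """Return True if the URL's query string contains the parameter ind=null.

    Assumption: the site marks a missing job by redirecting to a URL
    with ?ind=null, as observed in the question.
    """
    # parse_qs maps each query parameter name to a list of its values.
    query = parse_qs(urlsplit(final_url).query)
    return "null" in query.get("ind", [])
```

With requests, the final URL after redirects is available as `null_response.url`, so the check would be `redirected_to_empty_page(null_response.url)`.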
Upvotes: 2