Reputation: 55
I am trying to build my first web scraper and am very new to Python and programming in general. I am practicing on a county assessor's website, but my code does not work as expected (see below). When I run it, it returns the HTML for google.com, not the County Assessor's page.
Is this an issue with my Python code, or is something on the County Assessor's page redirecting me to Google? How do I fix this? Any help is much appreciated. Thanks.
#IMPORT LIBRARIES
from urllib.request import urlopen
from bs4 import BeautifulSoup
import requests
#SCRAPER CODE
web_page = 'https://mcassessor.maricopa.gov/index.php'
page = urlopen(web_page)
soup = BeautifulSoup(page,'html.parser')
print (soup)
Upvotes: 0
Views: 78
Reputation: 473893
There is just a User-Agent header check that you need to pass:
from bs4 import BeautifulSoup
import requests
web_page = 'https://mcassessor.maricopa.gov/index.php'
response = requests.get(web_page, headers={
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'
})
soup = BeautifulSoup(response.content, 'html.parser')
print(soup.prettify())
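If you would rather keep using urlopen as in the question, the same idea should carry over: wrap the URL in a urllib.request.Request that sends a browser-like User-Agent. This is only a minimal sketch. The names BROWSER_UA, build_request and fetch_html are made up for illustration, and I have not tested it against this particular site.

```python
from urllib.request import Request, urlopen

# Browser-like User-Agent string, the same one used with requests above.
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/63.0.3239.132 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request object that carries a browser-like User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_html(url):
    """Download the page body as bytes (this performs a network call)."""
    with urlopen(build_request(url), timeout=10) as response:
        return response.read()
```

You could then pass the result to BeautifulSoup as before, for example BeautifulSoup(fetch_html(web_page), 'html.parser').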
Upvotes: 1