MYX1994

Reputation: 11

Selenium webdriver does not open the correct url, rather it opens a blank page

I am using Selenium WebDriver to try to scrape information from realestate.com.au. Here is my code:

from selenium.webdriver import Chrome
from bs4 import BeautifulSoup

# Raw string so the backslashes in the Windows path are not read as escape sequences
path = r'C:\Program Files (x86)\Google\Chrome\Application\chromedriver.exe'
url = 'https://www.realestate.com.au/buy'
url2 = 'https://www.realestate.com.au/property-house-nsw-castle+hill-134181706'
webdriver = Chrome(path)
webdriver.get(url)
soup = BeautifulSoup(webdriver.page_source, 'html.parser')
print(soup)

It works fine with url, but when I try to open url2 the same way, it opens a blank page. I checked the browser console and got the following: "Failed to load resource: the server responded with a status of 429 () about:blank:1 Failed to load resource: net::ERR_UNKNOWN_URL_SCHEME 149e9513-01fa-4fb0-aad4-566afd725d1b/2d206a39-8ed7-437e-a3be-862e0f06eea3/fingerprint:1 Failed to load resource: the server responded with a status of 404 ()"

With url open, I also tried searching for anything from that page, and the search leads to a blank page as well, just like url2.

Upvotes: 1

Views: 1312

Answers (2)

Jeff Rainer

Reputation: 123

It looks like the www.realestate.com.au website is using an Akamai security tool.

A quick DNS lookup shows that www.realestate.com.au resolves to dualstack.realestate.com.au.edgekey.net.
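A rough way to repeat that lookup from Python is sketched below. It relies on the standard-library socket.gethostbyname_ex, which returns whatever canonical name and aliases the system resolver reports (this varies by platform). The suffix list and the helper names are assumptions added for illustration, and a matching suffix is only a hint that the host sits behind Akamai.

```python
import socket

# Assumption: domain suffixes commonly seen on Akamai edge hostnames.
AKAMAI_SUFFIXES = (".edgekey.net", ".akamaiedge.net")


def looks_like_akamai(hostname):
    """Return True if hostname ends with one of the suffixes listed above."""
    return hostname.lower().rstrip(".").endswith(AKAMAI_SUFFIXES)


def resolved_names(hostname):
    """Return the canonical name plus aliases reported by the system resolver."""
    canonical, aliases, _addresses = socket.gethostbyname_ex(hostname)
    return [canonical, *aliases]


def main():
    # Needs network access; prints each reported name and whether it matches.
    for name in resolved_names("www.realestate.com.au"):
        print(name, looks_like_akamai(name))
```

Calling main() on a machine with network access prints the names the resolver returns; the suffix check itself works offline.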

They are most likely using the Bot Manager product (https://www.akamai.com/us/en/products/security/bot-manager.jsp). I have encountered this on another website recently.

Typically, rotating user agents and IP addresses (ideally using residential proxies) should do the trick. You want to load the site with a "fresh" browser profile each time. You should also check out https://github.com/67-6f-64/akamai-sensor-data-bypass
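A minimal sketch of that idea with Selenium and Chrome follows. The user-agent strings, the proxy addresses (documentation-range IPs), and both function names are placeholders invented for illustration, not values known to work against this site. It assumes a Selenium version that accepts Chrome(options=...) and a chromedriver that Selenium can locate, and there is no guarantee it gets past the bot detection.

```python
import random
import tempfile

# Illustrative values only: swap in user agents and proxies you actually control.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]
PROXIES = ["203.0.113.10:8080", "203.0.113.11:8080"]  # hypothetical addresses


def build_chrome_arguments(rng=random):
    """Pick a user agent and a proxy, and create a brand-new profile directory."""
    profile_dir = tempfile.mkdtemp(prefix="chrome-profile-")
    return [
        f"--user-agent={rng.choice(USER_AGENTS)}",
        f"--proxy-server={rng.choice(PROXIES)}",
        f"--user-data-dir={profile_dir}",
    ]


def open_with_fresh_profile(url):
    """Start Chrome with the arguments above and load url; the caller must call quit()."""
    # Imported here so build_chrome_arguments works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in build_chrome_arguments():
        options.add_argument(argument)
    driver = webdriver.Chrome(options=options)  # assumes chromedriver can be located
    driver.get(url)
    return driver
```

Each call to open_with_fresh_profile starts a new browser with its own temporary profile, so cookies and local storage from earlier visits are not reused.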

Upvotes: 1

NeelD

Reputation: 71

I think you should try adding driver.implicitly_wait(10) before your get call. This adds an implicit wait in case the page loads too slowly for the driver to pull the site. You should also consider trying the Firefox webdriver, since this bug appears to affect only Chromium-based browsers.
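Both suggestions could look roughly like the sketch below. The function names are made up for illustration, and the Firefox part assumes Firefox and geckodriver are installed where Selenium can find them. Note that Selenium documents the implicit wait as applying to element lookups, so it may not change how get itself behaves.

```python
def fetch_page_source(driver, url, wait_seconds=10):
    """Set an implicit wait, load url, and return the rendered HTML."""
    driver.implicitly_wait(wait_seconds)  # applies to later element lookups
    driver.get(url)
    return driver.page_source


def fetch_with_firefox(url):
    """Run fetch_page_source in Firefox (assumes Firefox and geckodriver are available)."""
    from selenium import webdriver  # imported lazily; requires the selenium package

    driver = webdriver.Firefox()
    try:
        return fetch_page_source(driver, url)
    finally:
        driver.quit()  # always close the browser, even if loading fails
```

The returned HTML can then be passed to BeautifulSoup exactly as in the question.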

Upvotes: 0
