Reputation: 1511
I was doing a routine scraping task on a website.
The page loads its data through HTTP requests, and I can see the JSON in the responses, so I want to capture that JSON directly.
I found selenium-wire, which says it "Extends Selenium to give you the ability to inspect requests made by the browser."
It works as expected on the first page, and I get the JSON. But whenever I click a link on the page through the webdriver, the connection breaks with ERR_PROXY_CONNECTION_FAILED.
When I switch back to plain selenium, navigation works again (but without the JSON capture).
So, what potential problems should I check for? And
is there any other way to get the JSON? (Plain requests does not seem to work because the website requires a login.)
Upvotes: 3
Views: 4936
Reputation: 26
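I found out that I get this error once the script has finished executing. I put time.sleep(1000) at the end of the script; while the script is still running, clicking links and opening new pages works normally. Presumably this is because selenium-wire routes the browser through a proxy that lives inside the Python process, so the proxy disappears when the script exits while the browser window stays open.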
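In case it helps, here is a minimal sketch of doing all the work while the script is still alive: navigate, click, read the captured JSON, and only then quit. The function names, the `link_text` and `json_url_pattern` parameters, and the choice of Firefox are my own assumptions for illustration, not something from the question. The helper only handles uncompressed and gzip bodies; other encodings are skipped.

```python
import gzip
import json


def extract_json_bodies(requests):
    """Return (url, parsed_json) pairs for captured requests that have a JSON response."""
    results = []
    for request in requests:
        response = request.response
        if response is None:  # the request never got a response
            continue
        content_type = response.headers.get("Content-Type") or ""
        if "json" not in content_type.lower():
            continue
        body = response.body
        encoding = (response.headers.get("Content-Encoding") or "identity").lower()
        if encoding == "gzip":
            body = gzip.decompress(body)
        elif encoding != "identity":
            continue  # other encodings (e.g. br) are not handled in this sketch
        try:
            results.append((request.url, json.loads(body.decode("utf-8"))))
        except ValueError:  # covers both bad UTF-8 and invalid JSON
            continue
    return results


def scrape(start_url, link_text, json_url_pattern):
    """Hypothetical flow: open a page, click a link, collect JSON, then quit."""
    # Imported here so the helper above can be used without selenium installed.
    from selenium.webdriver.common.by import By
    from seleniumwire import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get(start_url)
        driver.find_element(By.LINK_TEXT, link_text).click()
        # Block until a request matching the (assumed) pattern has completed.
        driver.wait_for_request(json_url_pattern, timeout=10)
        return extract_json_bodies(driver.requests)
    finally:
        # Quit only after the captured requests have been read.
        driver.quit()
```

Everything that touches the browser happens before `driver.quit()`, so the proxy is still running for every click. If the browser has to stay open for manual inspection, something like the `time.sleep` above (or an `input()` prompt) before quitting would serve the same purpose.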
Upvotes: 1
Reputation: 23
options = {
    'connection_timeout': None,  # None means no timeout
    'proxy': {
        'http': 'http://username:password@host:port',
        'https': 'https://username:password@host:port',
        'no_proxy': 'localhost,127.0.0.1,dev_server:8080'
    }
}
Upvotes: 0
Reputation: 23
Selenium Wire works with an upstream proxy both with and without username/password authentication.
With authentication:
from seleniumwire import webdriver

options = {
    'proxy': {
        'http': 'http://username:password@host:port',
        'https': 'https://username:password@host:port',
        'no_proxy': 'localhost,127.0.0.1,dev_server:8080'
    }
}
driver = webdriver.Firefox(seleniumwire_options=options)
Without a username and password (optionally with a custom Proxy-Authorization header):
options = {
    'proxy': {
        'http': 'http://host:port',
        'https': 'https://host:port',
        'no_proxy': 'localhost,127.0.0.1,dev_server:8080',
        'custom_authorization': 'Bearer mytoken123'  # Custom Proxy-Authorization header value
    }
}
driver = webdriver.Firefox(seleniumwire_options=options)
Upvotes: 0