Reputation: 2020
I'm trying to set a proxy for web scraping using Selenium + PhantomJS. I'm using Python.
I've seen in many places that there is a bug in PhantomJS such that --proxy-auth does not work.
from selenium.webdriver.common.proxy import *
from selenium import webdriver
from selenium.webdriver.common.by import By
service_args = [
'--proxy=http://fr.proxymesh.com:31280',
'--proxy-auth=USER:PWD',
'--proxy-type=http',
]
driver = webdriver.PhantomJS(service_args=service_args)
driver.get("https://www.google.com")
print driver.page_source
ProxyMesh suggests using the following instead:
page.customHeaders={'Proxy-Authorization': 'Basic '+btoa('USERNAME:PASSWORD')};
but I'm not sure how to translate that into python.
This is what I currently have:
from selenium import webdriver
import base64
from selenium.webdriver.common.proxy import *
from selenium import webdriver
from selenium.webdriver.common.by import By
service_args = [
'--proxy=http://fr.proxymesh.com:31280',
'--proxy-type=http',
]
headers = { 'Proxy-Authorization': 'Basic ' + base64.b64encode('USERNAME:PASSWORD')}
for key, value in enumerate(headers):
webdriver.DesiredCapabilities.PHANTOMJS['phantomjs.page.customHeaders.{}'.format(key)] = value
driver = webdriver.PhantomJS(service_args=service_args)
driver.get("https://www.google.com")
print driver.page_source
but it doesn't work.
Any suggestions for how I could get this to work?
Upvotes: 6
Views: 3423
Reputation: 21
None of the above methods worked for me. I am using ProxyMesh proxies with Selenium and PhantomJS in Python, and the following parameters worked for me: they resolved the "proxy authentication failed" error.
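from selenium import webdriver

# Credentials are passed both in the proxy URL and via --proxy-auth
service_args = ['--proxy=http://username:password@host:port',
                '--proxy-type=http',
                '--proxy-auth=username:password']
driver = webdriver.PhantomJS(service_args=service_args)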
Upvotes: 0
Reputation: 2909
The solution with DesiredCapabilities didn't work for me. I ended up with the following solution:
from selenium import webdriver
driver = webdriver.PhantomJS(
    executable_path=config.PHANTOMJS_PATH,
    service_args=['--ignore-ssl-errors=true',
                  '--ssl-protocol=any',
                  '--proxy={}'.format(self.proxy),
                  '--proxy-type=http',
                  '--proxy-auth={}:{}'.format(self.proxy_username, self.proxy_password)])
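If the same arguments are needed in several places, they could be assembled by a small helper. This is only a sketch: `build_phantomjs_service_args` and its parameters are hypothetical names of mine, not part of Selenium or PhantomJS; only the command-line flags themselves come from the snippet above.

```python
def build_phantomjs_service_args(proxy, username=None, password=None,
                                 ignore_ssl_errors=False):
    """Return a list of PhantomJS command-line flags for an HTTP proxy.

    `proxy` is assumed to be a "host:port" string (hypothetical convention).
    """
    args = []
    if ignore_ssl_errors:
        # Same SSL flags as in the snippet above
        args += ['--ignore-ssl-errors=true', '--ssl-protocol=any']
    args += ['--proxy={}'.format(proxy), '--proxy-type=http']
    # Only add --proxy-auth when both credentials are supplied
    if username is not None and password is not None:
        args.append('--proxy-auth={}:{}'.format(username, password))
    return args
```

The returned list would then be passed as `service_args=...` to `webdriver.PhantomJS`, exactly as in the call above.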
Upvotes: 4
Reputation:
I'm compiling answers from "How to correctly pass basic auth (every click) using Selenium and phantomjs webdriver" as well as "base64.b64encode error":
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
import base64
service_args = [
'--proxy=http://fr.proxymesh.com:31280',
'--proxy-type=http',
]
# b64encode returns bytes on Python 3, so decode before concatenating with str
authentication_token = "Basic " + base64.b64encode(b'username:password').decode('ascii')
capa = DesiredCapabilities.PHANTOMJS
capa['phantomjs.page.customHeaders.Proxy-Authorization'] = authentication_token
driver = webdriver.PhantomJS(desired_capabilities=capa, service_args=service_args)
driver.get("http://...")
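The header-building part of this can be separated out and checked without starting a browser. The following is a minimal sketch, assuming the `phantomjs.page.customHeaders.<Name>` capability key used above; the two function names are hypothetical helpers, not Selenium APIs. Note that the loop iterates over `headers.items()`: using `enumerate(headers)` instead, as in the question, yields index/key pairs rather than header name/value pairs.

```python
import base64


def proxy_auth_header(username, password):
    """Build the value of a Proxy-Authorization header for HTTP Basic auth."""
    credentials = '{}:{}'.format(username, password).encode('utf-8')
    # b64encode returns bytes; decode so the result is a plain string
    return 'Basic ' + base64.b64encode(credentials).decode('ascii')


def capabilities_with_headers(base_caps, headers):
    """Return a copy of base_caps with one customHeaders entry per header."""
    caps = dict(base_caps)  # copy, so the shared defaults are not mutated
    for name, value in headers.items():
        caps['phantomjs.page.customHeaders.{}'.format(name)] = value
    return caps
```

With these, `capabilities_with_headers(DesiredCapabilities.PHANTOMJS, {'Proxy-Authorization': proxy_auth_header('username', 'password')})` would give a dictionary that can be passed as `desired_capabilities`.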
Upvotes: 5