Reputation: 21
I am currently using Selenium to crawl a website, and I have confirmed that I can set a proxy server in Selenium.
Now I want to use a paid rental proxy server. I got a trial address in the format IP:PORT:USER:PASS.
I don't know how to set USER:PASS, and the provider didn't know how to set it in Selenium either.
So I'm not sure what I can do now.
With a random proxy, this worked fine:
proxy_host = '185.186.61.44'
proxy_port = '11334'
options = Selenium::WebDriver::Chrome::Options.new
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument("--proxy-server=http://#{proxy_host}:#{proxy_port}")
So I wanted to set something like this:
proxy_host = '185.186.61.44'
proxy_port = '12323'
proxy_user = "7a2345129"
proxy_pass = "easdga341d4"
options = Selenium::WebDriver::Chrome::Options.new
options.add_argument("--proxy-server=http://#{proxy_host}:#{proxy_port}:#{proxy_user}:#{proxy_pass}")
But I found that it is not that easy, and the solutions I have read use Puppeteer.
I wonder if there is any solution for my case.
If anybody has any clues, I would love to hear them.
Thank you.
Upvotes: 2
Views: 1094
Reputation: 523
Selenium 4 added support for basic auth, which at the time of writing is Chrome specific.
See here for more details.
To specify basic auth creds:
driver.devtools.new
driver.register(username: 'username', password: 'password')
Example using scraperapi.com as proxy
require 'selenium-webdriver'
proxy = Selenium::WebDriver::Proxy.new(
http: 'proxy-server.scraperapi.com:8001',
ssl: 'proxy-server.scraperapi.com:8001'
)
cap = Selenium::WebDriver::Remote::Capabilities.chrome(proxy: proxy)
options = Selenium::WebDriver::Chrome::Options.new(
args: [
'--no-sandbox',
'--headless',
'--disable-dev-shm-usage',
'--single-process',
'--ignore-certificate-errors'
]
)
driver = Selenium::WebDriver.for(:chrome, capabilities: [options,cap])
driver.devtools.new
driver.register(username: 'scraperapi', password: 'xxxx')
driver.navigate.to("http://httpbin.org/ip")
puts "content: #{driver.page_source}"
The Chrome::Options above are specific to my use case, except for the ignore-certificate-errors option, which is needed to handle https traffic through scraperapi's proxies.
My Gemfile had:
gem 'selenium-devtools', '~> 0.91.0'
gem 'selenium-webdriver', '~> 4.1'
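For a provider line in the IP:PORT:USER:PASS format from the question, the pieces this approach needs are host:port for the proxy and the user/password pair for driver.register. A minimal sketch of that split, in plain Ruby; `ProxyLine` and `split_proxy_line` are hypothetical names used for illustration, not part of Selenium:

```ruby
# Hypothetical helper (not a Selenium API): split a provider line of the
# form IP:PORT:USER:PASS into its four parts.
ProxyLine = Struct.new(:host, :port, :user, :pass)

def split_proxy_line(line)
  # Limit the split to 4 fields so a password containing ':' stays intact.
  host, port, user, pass = line.strip.split(':', 4)
  if [host, port, user, pass].any? { |part| part.nil? || part.empty? }
    raise ArgumentError, 'expected IP:PORT:USER:PASS'
  end
  ProxyLine.new(host, Integer(port), user, pass)
end

entry = split_proxy_line('185.186.61.44:12323:7a2345129:easdga341d4')

# host:port is the value passed as http:/ssl: to Selenium::WebDriver::Proxy.new
proxy_address = "#{entry.host}:#{entry.port}"
puts proxy_address

# entry.user and entry.pass are the values that would be passed to
# driver.register(username: ..., password: ...)
```

The split limit of 4 is a design choice under the assumption that only the password may contain a colon.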
Upvotes: 1
Reputation: 461
The format of a URL is usually
proto://USER:PASS@host:port
So your code has to look like this:
proxy_host = '185.186.61.44'
proxy_port = '12323'
proxy_user = "7a2345129"
proxy_pass = "easdga341d4"
options = Selenium::WebDriver::Chrome::Options.new
options.add_argument("--proxy-server=http://#{proxy_user}:#{proxy_pass}@#{proxy_host}:#{proxy_port}")
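If the user name or password contains characters such as @ or :, they have to be percent-encoded before being placed in a URL of this shape. A small sketch of building the string, assuming ERB::Util.url_encode from the Ruby standard library for the escaping; `proxy_url` is a hypothetical helper name. It only shows how the URL is formed and does not check whether Chrome honors credentials passed through --proxy-server:

```ruby
require 'erb'

# Hypothetical helper: build proto://USER:PASS@host:port with the
# credentials percent-encoded so '@' and ':' cannot break the URL.
def proxy_url(host:, port:, user:, pass:, scheme: 'http')
  escaped_user = ERB::Util.url_encode(user)
  escaped_pass = ERB::Util.url_encode(pass)
  "#{scheme}://#{escaped_user}:#{escaped_pass}@#{host}:#{port}"
end

url = proxy_url(
  host: '185.186.61.44',
  port: '12323',
  user: '7a2345129',
  pass: 'easdga341d4'
)
puts url
```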
Upvotes: 0