sudonym

Reputation: 4028

How to disable screenshots and javascript for PhantomJS in python selenium?

I am scraping with Python and Selenium, using PhantomJS on Windows. First, I tried to disable JavaScript and screenshots through Selenium's desired capabilities:

driver = webdriver.PhantomJS("phantomjs.exe", desired_capabilities = dcap)
webdriver.DesiredCapabilities.PHANTOMJS["phantomjs.page.settings.javascriptEnabled"] = False
webdriver.DesiredCapabilities.PHANTOMJS["phantomjs.takesScreenshot"] = False
webdriver.DesiredCapabilities.PHANTOMJS["phantomjs.page.clearMemoryCash"] = False

However, when I look at ghostdriver.log, Session.negotiatedCapabilities includes:

browserName:phantomjs
version:2.1.1
driverName:ghostdriver
driverVersion:1.2.0
platform:windows-7-32bit
javascriptEnabled:true   # Should be false
takesScreenshot:true     # Should be false

Therefore, I think I need to disable both settings inside an onInitialized callback, along the lines of the snippet below:

phantom_exc_uri='/session/$sessionId/phantom/execute'
driver.command_executor._commands['executePhantomScript'] = ('POST', phantom_exc_uri)
initScript="""             
this.onInitialized=function() {
    var page=this;
    // disable javascript and screenshots here
}
"""
driver.execute('executePhantomScript',{'script': initScript, 'args': []})

Q1: Why can I set some PhantomJS settings through webdriver.DesiredCapabilities but not others? Is this my mistake or a bug?

Q2: Is it reasonable to do this in onInitialized, or am I on the wrong track?

Q3: If so, how do I disable JS and screenshots in onInitialized?

Upvotes: 1

Views: 450

Answers (1)

undetected Selenium

Reputation: 193338

You have raised quite a few queries in your question, so let me try to address them all. A simple workflow with Selenium v3.8.1, ghostdriver v1.2.0 and the phantomjs v2.1.1 browser shows that the following Session.negotiatedCapabilities are passed by default:

  • "browserName":"phantomjs"
  • "version":"2.1.1"
  • "driverName":"ghostdriver"
  • "driverVersion":"1.2.0"
  • "platform":"windows-8-32bit"
  • "javascriptEnabled":true
  • "takesScreenshot":true
  • "handlesAlerts":false
  • "databaseEnabled":false
  • "locationContextEnabled":false
  • "applicationCacheEnabled":false
  • "cssSelectorsEnabled":true
  • "webStorageEnabled":false
  • "rotatable":false
  • "acceptSslCerts":false
  • "nativeEvents":true
  • "proxy":{"proxyType":"direct"}

So, by default, the capabilities above are the minimum required to establish a successful session through the PhantomJSDriver and Ghost Browser combination.

Users then have the DesiredCapabilities class at their disposal to tweak the capabilities. But certain capabilities are a minimum requirement for creating a successful Ghost Browser session.

javascriptEnabled is one such mandatory property. Until a few releases ago, Selenium did allow setting the javascriptEnabled attribute to false. But now that WebDriver is a W3C Candidate Recommendation, the mandatory capabilities can no longer be overridden through DesiredCapabilities at the user level.

Even if you try to tweak them at the user level, WebDriver resets them to their defaults while configuring the capabilities.

So, although you tried the following:

webdriver.DesiredCapabilities.PHANTOMJS["phantomjs.page.settings.javascriptEnabled"] = False
webdriver.DesiredCapabilities.PHANTOMJS["phantomjs.takesScreenshot"] = False

The properties javascriptEnabled and takesScreenshot still default to the required mandatory configuration.
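One way to see which requested values did not survive negotiation is to compare the dictionary you passed in with what the session reports back. The following is only a minimal sketch: the helper name `overridden_capabilities` is hypothetical, and it assumes you have the negotiated values as a plain dict (with Selenium's Python bindings, `driver.capabilities` should give you that). The sample dicts here use the unprefixed names from the log shown in the question.

```python
def overridden_capabilities(requested, negotiated):
    """Return {name: (requested_value, negotiated_value)} for every
    requested capability whose negotiated value differs or is absent."""
    missing = object()  # sentinel so a real None value is not mistaken for "absent"
    diff = {}
    for name, wanted in requested.items():
        actual = negotiated.get(name, missing)
        if actual is missing:
            diff[name] = (wanted, None)
        elif actual != wanted:
            diff[name] = (wanted, actual)
    return diff


# Sample data taken from the capabilities discussed above (not a live session).
requested = {
    "browserName": "phantomjs",
    "javascriptEnabled": False,
    "takesScreenshot": False,
}
negotiated = {
    "browserName": "phantomjs",
    "version": "2.1.1",
    "javascriptEnabled": True,
    "takesScreenshot": True,
}

result = overridden_capabilities(requested, negotiated)
for name, (wanted, actual) in sorted(result.items()):
    print(name, "requested:", wanted, "negotiated:", actual)
```

With a live session you would pass your own desired-capabilities dict as `requested` and `driver.capabilities` as `negotiated`.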


Update

You asked in your comment: "What about changing those AFTER the Ghostdriver session is established, i.e. page.onInitialized?" The straight answer is no.

Once the capabilities are negotiated and frozen to initialize a browsing session, they hold for as long as that session is active. So you can't change any of the capabilities once the session is established. To change the capabilities, you have to configure the WebDriver instance again.
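Configuring the WebDriver instance again amounts to ending the current session and starting a new one with a fresh capabilities dict. Here is a rough sketch of that pattern. `restart_session` and `FakeDriver` are hypothetical names used only for illustration, and `FakeDriver` is a stand-in so the snippet runs without a browser. With Selenium 3.x the factory would presumably be something like `lambda caps: webdriver.PhantomJS("phantomjs.exe", desired_capabilities=caps)`.

```python
import copy


def restart_session(old_driver, make_driver, base_caps, overrides):
    """Quit the old session and start a new one with updated capabilities.

    make_driver is any callable that takes a capabilities dict and returns
    a driver. base_caps is copied so the shared defaults are never mutated.
    """
    caps = copy.deepcopy(base_caps)
    caps.update(overrides)
    old_driver.quit()          # capabilities of a live session cannot be changed
    return make_driver(caps)   # negotiation happens again for the new session


class FakeDriver:
    """Stand-in for a real WebDriver (assumption: only quit() is needed here)."""

    def __init__(self, caps):
        self.capabilities = caps
        self.closed = False

    def quit(self):
        self.closed = True


base = {"browserName": "phantomjs", "platform": "ANY"}
old = FakeDriver(dict(base))
new = restart_session(
    old,
    FakeDriver,
    base,
    {"phantomjs.page.settings.loadImages": False},
)
print(old.closed, new.capabilities)
```

Copying the base dict rather than mutating `webdriver.DesiredCapabilities.PHANTOMJS` in place keeps later sessions from silently inheriting earlier tweaks. Whether a given key is honored is still decided during negotiation, as described above.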

Upvotes: 1
