Golo Roden

Reputation: 150614

Understanding the Web Driver API

If I understand the W3C's information on the WebDriver API correctly, browsers that implement this API can be automated through a RESTful API. That is, I can open an HTTP connection to a browser and send commands to it using REST.

I also found this Gist, which seems to confirm my guess: the Node.js script connects directly to a PhantomJS instance that has been started as a WebDriver server.

So far, so good.

What I don't get is why I apparently still need a Selenium server for every other browser. Even projects such as webdriverjs require me to run a Selenium server.

My question is just: Why?

Shouldn't it be possible to automate a browser without an extra Selenium server? Shouldn't browsers be able to provide this API directly (as PhantomJS obviously does)?

Can anyone shed some light on this, please?

Upvotes: 1

Views: 171

Answers (1)

JimEvans

Reputation: 27496

You misunderstand the W3C spec. Though section 2.6 of the specification declares that implementers must provide a JSON-over-HTTP-accessible "remote end" of the protocol, it also declares that the implementation:

MAY take the form of a standalone executable that translates the JSON over HTTP protocol to the encoding and transport mechanism used by the remote end.

So, simply put, no: you cannot assume that you can fire up a browser instance and connect to it via HTTP. While PhantomJS does include its WebDriver implementation as part of the browser executable, other browsers may require a separate executable for this functionality. As an example, automating Chrome requires a running instance of the separate chromedriver executable, which implements the HTTP server portion of the protocol.
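To make that split concrete, here is a minimal Python sketch of the request a client would send to such a driver executable to open a session. Several things here are assumptions rather than anything from the spec text quoted above: the address `http://localhost:9515` presumes a chromedriver started locally on its default port, the helper names `build_new_session_request` and `send` are made up for illustration, and the payload carries both the older `desiredCapabilities` key and the newer `capabilities` key because which one a driver expects depends on its version.

```python
import json
import urllib.request


def build_new_session_request(base_url, browser_name):
    """Build (but do not send) the HTTP request asking a WebDriver server for a new session."""
    payload = {
        # Newer drafts of the protocol read this key (assumption: driver-dependent).
        "capabilities": {"alwaysMatch": {"browserName": browser_name}},
        # Older servers read this key instead (assumption: driver-dependent).
        "desiredCapabilities": {"browserName": browser_name},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/session",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def send(request):
    """Hypothetical helper: transmit the request and decode the JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumed address of a locally running chromedriver; nothing is sent here.
    req = build_new_session_request("http://localhost:9515", "chrome")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

The client talks to the driver executable, not to the browser itself. Pointing `base_url` at a Selenium server instead would leave the client code unchanged, which is why the server can stand in for browsers that ship no HTTP endpoint of their own.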

Furthermore, it is important to note that the specification is currently at the working draft stage and has not yet reached last call or candidate recommendation status. This means that not every browser vendor has published an implementation of the specification yet. Mozilla has an implementation in progress for Firefox, and Microsoft recently published one for Internet Explorer, but neither is complete, and both require external executables for access via HTTP, just as Chrome does.

Thus, for the moment, using the Selenium server is your only option for browsers that do not currently supply an HTTP implementation. Among the major desktop browsers, that list currently includes Firefox, Internet Explorer, and Safari.

Upvotes: 5
