Reputation: 1107
I'm trying to use PhantomJS (in Windows) with Selenium in Python to do some web scraping.
I downloaded the latest PhantomJS build from the website and unzipped it. After that I tried:
from selenium import webdriver
browser = webdriver.PhantomJS()
The response was
WebDriverException: 'phantomjs' executable needs to be in PATH.
Then I tried passing the path to the executable explicitly, like
browser = webdriver.PhantomJS('path_to/phantomjs.exe')
I've also tried putting an 'r' prefix before the path string. The response was an exception containing HTML:
WebDriverException: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Notification: Gateway Timeout</title>
<style type="text/css">
body {
font-family: Arial, Helvetica, sans-serif;
font-size: 14px;
color:#333333;
background-color: #ffffff;
}
h1 {
font-size: 18px;
font-weight: bold;
text-decoration: none;
padding-top: 0px;
color: #2970A6;
}
a:link {
color: #2970A6;
text-decoration: none;
}
a:hover {
color: #2970A6;
text-decoration: underline;
}
p.buttonlink {
margin-bottom: 24px;
}
.copyright {
font-size: 12px;
color: #666666;
margin: 5px 5px 0px 30px;
}
.details {
font-size: 14px;
color: #969696;
border: none;
padding: 20px 20px 20px 20px;
margin: 0px 10px 10px 35px;
}
.shadow {
border: 3px solid #9f9f9f;
padding: 10px 25px 10px 25px;
margin: 10px 35px 0px 30px;
background-color: #ffffff;
width: 600px;
-moz-box-shadow: 3px 3px 3px #cccccc;
-webkit-box-shadow: 3px 3px 3px #cccccc;
box-shadow: 3px 3px 3px #cccccc;
/* For IE 8 */
-ms-filter: "progid:DXImageTransform.Microsoft.Shadow(Strength=5, Direction=135, Color='cccccc')";
/* For IE 5.5 - 7 */
filter: progid:DXImageTransform.Microsoft.Shadow(Strength=5, Direction=135, Color='cccccc');
}
.logo {
border: none;
margin: 5px 5px 0px 30px;
}
</style>
</head>
<body>
<div class="logo"></div><p> </p>
<div class="shadow">
<h1>This Page Cannot Be Displayed</h1>
<p>
The system cannot communicate with the external server ( 127.0.0.1 ).
The Internet server may be busy, may be permanently down, or may be
unreachable because of network problems.
</p>
<p>
Please check the spelling of the Internet address entered.
If it is correct, try this request later.
</p>
<p>
If you have questions, please contact
your corporate network administrator
and provide the codes shown below.
</p>
</div>
<div class="details"><p>
Date: Mon, 30 May 2016 12:30:14 CEST<br />
Username: <br />
Source IP: 10.202.210.98<br />
URL: POST http://127.0.0.1/wd/hub/session<br />
Category: Uncategorized URLs<br />
Reason: UNKNOWN<br />
Notification: GATEWAY_TIMEOUT
</p></div>
</body>
</html>
I opened this HTML in Chrome, and it renders my company's firewall page. The message is "The system cannot communicate with the external server ( 127.0.0.1 ).". I can scrape with the Chrome or Firefox drivers, but I have this problem with PhantomJS.
Can you help me?
Upvotes: 2
Views: 2537
Reputation: 2854
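Try using the absolute path to where you installed PhantomJS, as below, and also set the 'NO_PROXY' environment variable to '127.0.0.1' so that requests to the local driver are not sent through the corporate proxy: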
import os
from selenium import webdriver
# Set NO_PROXY before creating the driver
os.environ['NO_PROXY'] = '127.0.0.1'
driver = webdriver.PhantomJS(
    executable_path=r'C:\Python\Python35-32\Lib\site-packages\phantomjs-2.1.1-windows\bin\phantomjs')
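If NO_PROXY is already set on your machine, assigning to it directly replaces whatever was there. Below is a minimal sketch of appending to it instead. The helper name add_no_proxy is made up for illustration and is not part of Selenium, and it assumes the HTTP client used by your Selenium version honours this variable, which is what this answer relies on.

```python
import os


def add_no_proxy(hosts, environ=os.environ):
    """Append hosts to NO_PROXY, keeping any entries that are already there.

    add_no_proxy is a hypothetical helper, not part of Selenium.
    """
    # Read whichever spelling is already set, preferring the uppercase one.
    current = environ.get("NO_PROXY") or environ.get("no_proxy") or ""
    entries = [h.strip() for h in current.split(",") if h.strip()]
    for host in hosts:
        if host not in entries:
            entries.append(host)
    value = ",".join(entries)
    # Set both spellings, since some tools read only the lowercase one.
    environ["NO_PROXY"] = value
    environ["no_proxy"] = value
    return value
```

Call add_no_proxy(['127.0.0.1', 'localhost']) before creating the driver, since the variable has to be in place when the first request to the local driver is made.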
Upvotes: 2
Reputation: 71
I'm trying to get this working under Windows, too...
I'm subclassing the WebDriver, and I pass the ABSOLUTE path the executable is in to the superclass's __init__ method when initializing it.
This has got me further - now I'm seeing
Exception WebDriverException: Message: 'phantomjs' executable may have wrong permissions.
which gives me the impression I'm on the right track. It suggests you could just pass the path the executable is in to the constructor of the webdriver object (as a simple string).
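One possible reading of that error, which I have not confirmed against the Selenium source, is that the string points at the folder rather than at the phantomjs executable itself. Here is a small sketch for checking what a path actually refers to before handing it to the constructor. describe_driver_path is a made-up helper name, not a Selenium function.

```python
import os


def describe_driver_path(path):
    """Classify a candidate driver path (hypothetical helper for debugging)."""
    if os.path.isdir(path):
        # A folder was given; the driver needs the executable file itself.
        return "directory"
    if not os.path.isfile(path):
        return "missing"
    # Coarse check of the execute permission as the OS reports it.
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"
```

If this returns "directory" for the string you are passing, try appending the executable's file name to the path.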
Upvotes: 0