Reputation: 3034
This https://www.npmjs.com/package/phantom#functionality-details page says:
You can also pass command line switches to the phantomjs process by specifying additional args to phantom.create(), eg:
phantom.create '--load-images=no', '--local-to-remote-url-access=yes', (page) ->
or by specifying them in the options* object:
phantom.create {parameters: {'load-images': 'no', 'local-to-remote-url-access': 'yes'}}, (page) ->
These examples are only in CoffeeScript, and they imply that the create function can take
create('string',function)
or
create([object object],function)
but really the first parameter expected is the function!
I really wanted to try the command-line options from http://phantomjs.org/api/command-line.html. I might have the wrong idea, but it looks to me like they can be passed to the create function (right before you call createPage). Am I wrong?
I have tried several things, the most logical one is this:
var phantom = require('phantom');
phantom.create(function(browser){
  browser.createPage(function(page){
    page.open('http://example.com/req.php', function() {
    });
  },{parameters:{'proxy':'98.239.198.83:21320'}});
});
The page does get opened. I know this because req.php saves the $_SERVER array to a text file, but the REMOTE_ADDR and REMOTE_PORT values are not those of the proxy I set, so the setting has no effect. I have also tried:
{options:{'proxy':'98.239.198.83:21320'}}
since the docs call that object the options* object (see above),
and
'--proxy=98.239.198.83:21320'
I have also dug through the phantom module to find the create function. It does not appear to be written in JavaScript, or at least I can't find it, so it must be in C++. The module looks like it has been updated, but the examples deep inside it look like old code.
How do I do this?
EDIT:
var phantom = require('phantom');
phantom.create(function(browser){
  browser.createPage(function(page){
    browser.setProxy('98.239.198.83','21320','http', null, null, function(){
      page.open('http://example.com/req.php', function() {
      });
    });
  });
});
This produces no error and the page gets scraped but the proxy is ignored.
Upvotes: 15
Views: 31098
Reputation: 194
Yet another solution for Node.js:
// await needs an async function in a CommonJS script, hence the wrapper
(async function () {
  const phantomInstance = await require('phantom').create();
  const page = await phantomInstance.createPage();
  // get current settings:
  var pageSettings = await page.property('settings');
  /*
  {
    XSSAuditingEnabled: false,
    javascriptCanCloseWindows: true,
    javascriptCanOpenWindows: true,
    javascriptEnabled: true,
    loadImages: true,
    localToRemoteUrlAccessEnabled: false,
    userAgent: 'Mozilla/5.0 (Unknown; Linux x86_64) ... PhantomJS/2.1.1 Safari/538.1',
    webSecurityEnabled: true
  }
  */
  pageSettings.proxy = 'https://78.40.87.18:808';
  // update settings (return value is undefined):
  await page.property('settings', pageSettings);
  const status = await page.open('https://2ip.ru/');
  // show IP:
  var ip = await page.evaluate(function () {
    var el = document.getElementById('d_clip_button');
    return !el ? '?' : el.textContent;
  });
  console.log('IP:', ip);
})();
This is a way to set the proxy for a specific page.
Upvotes: 1
Reputation: 774
I'm running PhantomJS from the Windows command prompt, and the syntax it uses looks a bit different from what is shown here. I noticed that if you don't include http:// in the proxy value, PhantomJS won't recognize it. This is a complete example:
var page = require('webpage').create();
page.settings.loadImages = false;
page.settings.proxy = 'http://192.168.1.5:8080';
page.settings.userAgent = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36';
page.open('http://some.com/page', function() {
  page.render('some.png');
  phantom.exit();
});
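To guard against the missing http:// prefix mentioned above, a small helper can add it before the value is assigned to page.settings.proxy. This is only a minimal sketch: withScheme is a hypothetical name (not a PhantomJS API), and falling back to http is an assumption.

```javascript
// Hypothetical helper (not a PhantomJS API): prepend a scheme to a proxy
// address when none is present, so the setting receives a full URL.
function withScheme(proxy, defaultScheme) {
  var scheme = defaultScheme || 'http'; // assumption: plain HTTP proxy by default
  // A scheme looks like "http://" or "socks5://" at the start of the string.
  var hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(proxy);
  return hasScheme ? proxy : scheme + '://' + proxy;
}

console.log(withScheme('192.168.1.5:8080'));        // http://192.168.1.5:8080
console.log(withScheme('http://192.168.1.5:8080')); // http://192.168.1.5:8080
```

The result could then be used as page.settings.proxy = withScheme(address).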
Upvotes: 2
Reputation: 179
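Use the phantom npm package together with the co npm package.
const co = require('co');
const phantom = require('phantom');
co(function*() {
  const phantomInstance = yield phantom.create(["--proxy=171.13.36.64:808"]);
  // crawScheduler is defined elsewhere (not shown)
  crawScheduler.start(phantomInstance);
});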
Upvotes: 2
Reputation: 774
PhantomJS has since gained the ability to set a proxy "on the fly" (even on a per-page basis); see this commit: https://github.com/ariya/phantomjs/commit/efd8dedfb574c15ddaac26ae72690fc2031e6749
Here is a sample usage of the new setProxy function (I did not find an example of the web page setting; this shows general proxy usage on the phantom instance):
https://github.com/ariya/phantomjs/blob/master/examples/openurlwithproxy.js
If you want a per-page proxy, use a full URL for the proxy (scheme, user name, password, host and port, all in the URL).
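To make the "full URL" form concrete, here is a minimal sketch that assembles one from its parts. buildProxyUrl and the sample credentials are hypothetical, and percent-encoding the user name and password is an assumption about how the URL gets parsed, so check it against your PhantomJS version.

```javascript
// Hypothetical helper: assemble scheme://user:password@host:port.
// The credentials are optional; they are percent-encoded (an assumption)
// so that characters such as "@" or ":" do not break the URL structure.
function buildProxyUrl(p) {
  var auth = '';
  if (p.user) {
    auth = encodeURIComponent(p.user);
    if (p.password) {
      auth += ':' + encodeURIComponent(p.password);
    }
    auth += '@';
  }
  return p.scheme + '://' + auth + p.host + ':' + p.port;
}

// Sample values are made up for illustration.
console.log(buildProxyUrl({
  scheme: 'http',
  user: 'alice',
  password: 'secret',
  host: '192.168.1.5',
  port: 8080
})); // http://alice:secret@192.168.1.5:8080
```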
Upvotes: 3
Reputation: 3034
{ parameters: { 'proxy': 'socks://98.239.198.83:21320' } }
They didn't update their docs.
Upvotes: 4
Reputation: 1838
As of phantom version 2.0.10, the following code runs well on my Windows machine:
var phantom = require('phantom');
var phInstance, sitepage; // shared between the promise callbacks below
phantom.create(["--proxy=201.172.242.184:15124", "--proxy-type=socks5"])
  .then((instance) => {
    phInstance = instance;
    return instance.createPage();
  })
  .then((page) => {
    sitepage = page;
    return page.open('http://newsdaily.online');
  })
  .then((status) => {
    console.log(status);
    return sitepage.property('title');
  })
  .then((content) => {
    console.log(content);
    sitepage.close();
    phInstance.exit();
  })
  .catch((error) => {
    console.log(error);
    phInstance.exit();
  });
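If the proxy address and type come from configuration, the two switches can be built by a small helper. This is a sketch only: proxySwitches is a hypothetical function (not part of the phantom module), and the list of accepted types is taken from the PhantomJS command-line documentation (http, socks5, none).

```javascript
// Hypothetical helper (not part of the phantom module): build the
// --proxy and --proxy-type switches, rejecting types the PhantomJS
// command-line documentation does not list.
var PROXY_TYPES = ['http', 'socks5', 'none'];

function proxySwitches(address, type) {
  var proxyType = type || 'http'; // http is the documented default
  if (PROXY_TYPES.indexOf(proxyType) === -1) {
    throw new Error('Unsupported proxy type: ' + proxyType);
  }
  return ['--proxy=' + address, '--proxy-type=' + proxyType];
}

console.log(proxySwitches('201.172.242.184:15124', 'socks5'));
```

The returned array has the same shape as the one passed to phantom.create() above.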
Upvotes: 6
Reputation: 2836
As a side effect of trying to figure out an issue on GitHub for phantomjs-nodejs, I was able to set a proxy as follows:
phantom = require 'phantom'
parameters = {
  loadimages: '--load-images=no',
  websecurity: '--web-security=no',
  ignoresslerrors: '--ignore-ssl-errors=yes',
  proxy: '--proxy=10.0.1.235:8118',
}
urls = {
  checktor: "https://check.torproject.org/",
  google: "https://google.com",
}
phantom.create parameters.websecurity, parameters.proxy, (ph) ->
  ph.createPage (page) ->
    page.open urls.checktor, (status) ->
      console.log "page opened? ", status
      page.evaluate (-> document.title), (result) ->
        console.log 'Page title is ' + result
        ph.exit()
The result where the proxy uses Tor was:
page opened? success
Page title is Congratulations. This browser is configured to use Tor.
Upvotes: 2
Reputation: 9
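var phantom = require('phantom');
phantom.create(function (browser) {
  // proxyIP, proxyPort and url are placeholders that you need to define
  browser.setProxy(proxyIP, proxyPort);
  // the page object comes from createPage
  browser.createPage(function (page) {
    page.open(url, function (status) {
      console.log(status);
    });
  });
},{dnodeOpts:{weak: false}});
It works fine on my Windows machine.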
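Since setProxy takes the host and port as separate arguments, a proxy written as host:port has to be split first. Here is a minimal sketch; splitProxy is a hypothetical helper, and returning the port as a number is an assumption (the question's edit passes it as a string).

```javascript
// Hypothetical helper: split "host:port" into the two pieces that
// setProxy expects as separate arguments.
function splitProxy(address) {
  var idx = address.lastIndexOf(':');
  if (idx === -1) {
    throw new Error('Expected host:port, got: ' + address);
  }
  var port = Number(address.slice(idx + 1));
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error('Invalid port in: ' + address);
  }
  return { host: address.slice(0, idx), port: port };
}

var proxy = splitProxy('98.239.198.83:21320');
console.log(proxy.host, proxy.port); // 98.239.198.83 21320
```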
Upvotes: -1
Reputation: 61892
The CoffeeScript example is a little strange, because it is the browser that is passed into the callback of phantom.create and not page, but otherwise it must be compatible judging by the code.
var phantom = require('phantom');
phantom.create({
  parameters: {
    proxy: '98.239.198.83:21320'
  }
}, function(browser){
  browser.createPage(function(page){
    page.open('http://example.com/req.php', function() {
      ...
    });
  });
});
Proxy settings are set during process creation, not during page opening. However, PhantomJS contains an undocumented phantom.setProxy() function that lets you change the proxy settings in the middle of the script, and the phantom module also seems to support it.
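Because the proxy is fixed when the PhantomJS process starts, the parameters object has to end up as command-line switches such as --proxy=.... The sketch below shows that mapping, assuming each key/value pair becomes one --key=value switch, as the equivalence in the docs quoted in the question suggests. toSwitches is a hypothetical helper, not the module's actual code.

```javascript
// Hypothetical helper (not the phantom module's real implementation):
// turn a parameters object into PhantomJS-style command-line switches.
function toSwitches(parameters) {
  return Object.keys(parameters).map(function (key) {
    return '--' + key + '=' + parameters[key];
  });
}

console.log(toSwitches({ proxy: '98.239.198.83:21320', 'load-images': 'no' }));
// [ '--proxy=98.239.198.83:21320', '--load-images=no' ]
```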
Upvotes: 0