ThomasReggi

Reputation: 59345

Access raw "page.content" from phantom module

I am using the phantom module with Node.js, not the PhantomJS runtime directly.

How do I access page.content?

The example below does not work.

var phantom = require('phantom')

phantom.create(function (ph) {
  ph.createPage(function (page) {
    page.open('http://www.google.com', function (status) {
      console.log(status) // -> success
      console.log(page.content) // -> undefined
      console.log(page.getContent()) // -> undefined
      ph.exit()
    })
  })
})

Upvotes: 2

Views: 201

Answers (2)

Artjom B.

Reputation: 61892

Since the phantom module (a bridge between Node.js and PhantomJS) is asynchronous in nature, its API is a little different from plain PhantomJS. The differences are described on the project page, particularly in the Functional Details section:

Properties can't be get/set directly, instead use page.get('version', callback) or page.set('viewportSize', {width:640,height:480}), etc. Nested objects can be accessed by including dots in keys, such as page.set('settings.loadImages', false)

In your case that would be

page.get("content", function(content){
    console.log(content);
});

This should give you the complete page markup as a string (the serialized DOM). See my post here for different ways of getting different representations of the DOM.
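Put together with the code from the question, the asynchronous getter could be used roughly like this. This is a minimal sketch, assuming the callback-style API quoted above (phantom.create, ph.createPage, page.open, page.get). The helper name fetchContent and its error-first done callback are hypothetical, not part of the phantom module, and the module is passed in as an argument only to keep the snippet self-contained.

```javascript
// Hypothetical helper (not part of the phantom module): loads `url` and hands
// the page markup to `done(err, content)` once the bridge has returned it.
function fetchContent (phantom, url, done) {
  phantom.create(function (ph) {
    ph.createPage(function (page) {
      page.open(url, function (status) {
        if (status !== 'success') {
          ph.exit() // shut PhantomJS down on failure as well
          return done(new Error('Failed to load ' + url + ' (status: ' + status + ')'))
        }
        // Properties are read asynchronously through page.get(key, callback).
        page.get('content', function (content) {
          ph.exit()
          done(null, content)
        })
      })
    })
  })
}

// Usage, assuming the phantom package is installed:
// fetchContent(require('phantom'), 'http://www.google.com', function (err, html) {
//   if (err) return console.error(err.message)
//   console.log(html)
// })
```

The point is that page.content is never read synchronously: the value only exists inside the callback passed to page.get, so anything that needs the markup has to be called from there.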

Upvotes: 2

Vishnu

Reputation: 12283

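var phantom = require('phantom')

phantom.create(function (ph) {
  ph.createPage(function (page) {
    page.open('http://www.google.com', function (status) {
      if (status !== 'success') {
        console.log('Unable to access the network!')
        ph.exit() // exit on failure too, so the PhantomJS process does not linger
      } else {
        // This function runs inside the page. Only JSON-serializable values
        // are passed back, so return the markup as a string, not the DOM node.
        page.evaluate(function () {
          return document.body.innerHTML
        }, function (body) {
          console.log(body)
          ph.exit()
        })
      }
    })
  })
})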

Upvotes: -1
