kxhitiz

Reputation: 1949

Load a page fully without rendering it in a browser

I need to render a page fully without actually loading it in a browser, and read the content as a string: the final page text after all DOM manipulation by JavaScript is done. Can you suggest a solution, or a tool I can use?

I am using the Ruby on Rails framework.

Upvotes: 1

Views: 989

Answers (3)

SoAwesomeMan

Reputation: 3406

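1) Install PhantomJS so that the phantomjs executable is available on the command line of your operating system.

2) Load a small helper module at boot, then call it wherever you need the rendered page: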

# config/application.rb
module YourApp
  class Application < Rails::Application
    config.after_initialize do
      require Rails.root.join('lib/page_to_s.rb')
    end
  end
end


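# lib/page_to_s.rb
require 'json'     # String#to_json, used to escape the URL below
require 'tempfile' # see: http://www.ruby-doc.org/stdlib-1.9.3/libdoc/tempfile/rdoc/Tempfile.html

module PageToS
  extend self

  # Returns what PhantomJS prints for +url+: the page HTML after its JavaScript has run.
  def get(url)
    # Array form so that the temp file name ends in .js
    file = ::Tempfile.new(['page_to_s', '.js'])
    begin
      # http://techslides.com/grabbing-html-source-code-with-phantomjs-or-casperjs/
      # url.to_json emits a quoted, escaped JavaScript string literal
      file.write("var page = require('webpage').create();page.open(#{url.to_json}, function (status) {var js = page.evaluate(function () {return document;});console.log(js.all[0].outerHTML); phantom.exit();});")
      file.close
      `phantomjs #{file.path}`
    ensure
      file.unlink # always remove the temp script
    end
  end
end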

# example usage, anywhere in your app
str = PageToS.get('http://localhost:3000/any_page')
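
Two things in the snippet above can fail quietly: phantomjs may not be on the PATH, and backticks return whatever was printed even when the command exits with an error. Below is a minimal sketch of guarding against both with the standard library only. The helper names find_executable and capture_stdout are made up for illustration and are not part of the module above; on Windows the executable name would also need its extension.

```ruby
require 'open3'

# Hypothetical helper: returns the full path of +name+ if an executable file
# with that name exists in one of +dirs+, otherwise nil.
def find_executable(name, dirs = ENV.fetch('PATH', '').split(File::PATH_SEPARATOR))
  dirs.map { |dir| File.join(dir, name) }
      .find { |path| File.file?(path) && File.executable?(path) }
end

# Hypothetical helper: runs a command given as separate arguments and returns
# its stdout, raising if the exit status is non-zero.
def capture_stdout(*argv)
  stdout, stderr, status = Open3.capture3(*argv)
  raise "#{argv.first} exited with #{status.exitstatus}: #{stderr}" unless status.success?
  stdout
end

# Possible use inside PageToS.get, in place of the backticks:
#   phantomjs = find_executable('phantomjs') or raise 'phantomjs not found on PATH'
#   capture_stdout(phantomjs, file.path)
```

Passing the script path as its own argument also avoids building a shell command string by interpolation.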

Upvotes: 0

tekrat

Reputation: 115

Here are a few ways I can think of doing this:

  • Use the MSHTML COM object in RoR
    • If you're on a Windows box and your RoR install can call COM/ActiveX objects, you could instantiate an MSHTML object, render the page, and grab the content
  • Write a NodeJS server
    • You can pull the same trick with NodeJS: render the page in memory and serve the content as a web service to your RoR instance
  • Write a Node-Webkit server
    • The same idea as above, but you'll have direct access to the WebKit rendering engine

All of these can work, but you're looking at adding at least a second of load time each time you call one of these processes. You're also, in essence, running a mini web browser, which can be a memory hog and may affect the long-term stability of your server.
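
From the Rails side, the NodeJS and Node-Webkit options both reduce to an HTTP call to a separate rendering process. Here is a rough sketch of such a client, assuming a hypothetical service at http://localhost:3001/render that takes a url query parameter and responds with the rendered HTML. The endpoint, port, parameter name, and timeout values are all assumptions for illustration. The timeouts are there because each call can be slow.

```ruby
require 'net/http'
require 'uri'

# Assumed address of the rendering service; nothing in this thread specifies it.
RENDER_ENDPOINT = 'http://localhost:3001/render'

# Builds the request URI, form-encoding the target page URL as a query parameter.
def render_request_uri(page_url, endpoint = RENDER_ENDPOINT)
  uri = URI.parse(endpoint)
  uri.query = URI.encode_www_form(url: page_url)
  uri
end

# Asks the (hypothetical) service to render +page_url+ and returns the body.
# Raises if the service does not answer with a 2xx status.
def fetch_rendered_html(page_url, open_timeout: 2, read_timeout: 10)
  uri = render_request_uri(page_url)
  Net::HTTP.start(uri.host, uri.port,
                  open_timeout: open_timeout, read_timeout: read_timeout) do |http|
    response = http.get(uri.request_uri)
    raise "render service returned #{response.code}" unless response.is_a?(Net::HTTPSuccess)
    response.body
  end
end
```

Keeping the renderer in its own process also means its memory use can be monitored and restarted independently of the Rails server.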

Upvotes: 0

ironsand

Reputation: 15161

As adeneo suggested, a headless browser is what you want, for example PhantomJS or the selenium-webdriver gem.
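
For the selenium-webdriver route, a minimal sketch might look like the following. It assumes the selenium-webdriver gem is installed along with a local Chrome and a matching chromedriver, none of which this answer spells out. The method names headless_chrome_driver and rendered_html are hypothetical.

```ruby
# Builds a headless Chrome driver. The gem is required lazily so that this
# file can be loaded even where selenium-webdriver is not installed.
def headless_chrome_driver
  require 'selenium-webdriver'
  options = Selenium::WebDriver::Chrome::Options.new
  options.add_argument('--headless')
  Selenium::WebDriver.for(:chrome, options: options)
end

# Loads +url+ in the given driver and returns the page source as a string.
# The driver is quit afterwards, even if loading raises.
def rendered_html(url, driver: headless_chrome_driver)
  driver.get(url)
  driver.page_source
ensure
  driver&.quit
end

# Example call (needs Chrome and chromedriver):
#   html = rendered_html('http://localhost:3000/any_page')
```

Note that get returns once the page has loaded; content that JavaScript adds later may need an explicit wait before page_source is read.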

Upvotes: 1
