Reputation: 1949
I need to render a page fully without actually loading it in a browser, and read the content as a string: the final page text after all DOM manipulation by JavaScript is done. Can you suggest a solution, or any tool I can use for this?
I am using the Ruby on Rails framework.
Upvotes: 1
Views: 989
Reputation: 3406
1) Install PhantomJS so that it is available from the command line on your operating system.
2) Add the following code:
# config/application.rb
module YourApp
  class Application < Rails::Application
    config.after_initialize do
      require Rails.root.join('lib/page_to_s.rb')
    end
  end
end
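# lib/page_to_s.rb
require 'json'
require 'tempfile' # see: http://www.ruby-doc.org/stdlib-1.9.3/libdoc/tempfile/rdoc/Tempfile.html

module PageToS
  extend self

  # Returns the HTML of the page at `url` as PhantomJS sees it once loading has finished.
  def get(url)
    file = ::Tempfile.new('page_to_s.js')
    begin
      # http://techslides.com/grabbing-html-source-code-with-phantomjs-or-casperjs/
      # url.to_json emits a quoted, escaped string literal, so quotes in the URL cannot break the script.
      # The markup is serialized inside the page, because page.evaluate can only return plain values.
      file.write("var page = require('webpage').create();page.open(#{url.to_json}, function (status) {var html = page.evaluate(function () {return document.documentElement.outerHTML;});console.log(html); phantom.exit();});")
      file.close
      `phantomjs #{file.path}`
    ensure
      file.unlink
    end
  end
end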
# anywhere
str = PageToS.get('http://localhost:3000/any_page')
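One caveat with the approach above: the page.open callback fires when the page has finished loading, which can be before asynchronous scripts (timers, AJAX callbacks) have updated the DOM. A rough variant that waits a fixed delay before reading the markup might look like the sketch below. The module name DelayedPageToS, the script_for helper, and the wait_ms parameter are made up for illustration, and the 500 ms default is an arbitrary guess, not a recommended value.

```ruby
require 'json'
require 'open3'
require 'tempfile'

# Hypothetical helper, not part of the answer above.
module DelayedPageToS
  module_function

  # Builds the PhantomJS script as a string.
  # wait_ms is an assumed "settle" delay for asynchronous scripts.
  def script_for(url, wait_ms)
    <<~JS
      var page = require('webpage').create();
      page.open(#{url.to_json}, function (status) {
        if (status !== 'success') { phantom.exit(1); return; }
        // Give asynchronous scripts time to finish before reading the DOM.
        window.setTimeout(function () {
          console.log(page.content);
          phantom.exit(0);
        }, #{Integer(wait_ms)});
      });
    JS
  end

  # Runs the script with the phantomjs binary found on PATH
  # and returns whatever it printed (the rendered HTML).
  def get(url, wait_ms: 500)
    Tempfile.create(['page_to_s', '.js']) do |file|
      file.write(script_for(url, wait_ms))
      file.flush
      html, status = Open3.capture2('phantomjs', file.path)
      raise "phantomjs failed for #{url}" unless status.success?
      html
    end
  end
end
```

Usage would be the same as before, e.g. DelayedPageToS.get('http://localhost:3000/any_page', wait_ms: 1000). A fixed delay is crude; if the page sets some marker when it is done, polling for that marker would be more reliable.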
Upvotes: 0
Reputation: 115
Here are a few ways I can think of to do this:
All of these can work, but you're looking at adding at least a second of load time each time you call one of these processes. Also, you are in essence running a mini web browser, which can be a memory hog and may affect the long-term stability of your server.
Upvotes: 0
Reputation: 15161
As adeneo suggested, a headless browser is what you want, for example PhantomJS and the selenium-webdriver gem.
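The selenium-webdriver route could look roughly like this. Note that this sketch swaps in headless Chrome rather than PhantomJS, since PhantomJS is no longer maintained. It assumes the selenium-webdriver gem, Chrome, and a matching chromedriver are installed. The method names rendered_html and default_driver are made up for illustration.

```ruby
# Builds a headless Chrome driver. The gem is required lazily so the
# rest of the code can be loaded (and tested) without a browser.
def default_driver
  require 'selenium-webdriver' # assumption: the gem is installed
  options = Selenium::WebDriver::Chrome::Options.new
  options.add_argument('--headless')
  Selenium::WebDriver.for(:chrome, options: options)
end

# Returns the serialized DOM of `url` as the browser currently sees it.
# `driver` can be any object responding to #get, #page_source and #quit.
def rendered_html(url, driver: default_driver)
  driver.get(url)
  driver.page_source
ensure
  # Always shut the browser down so processes do not pile up on the server.
  driver&.quit
end
```

Calling rendered_html('http://localhost:3000/any_page') would then return the markup as a string. Starting a browser per call is slow, so for repeated use it may be worth keeping one driver around and passing it in, at the cost of managing its lifetime yourself.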
Upvotes: 1