Reputation: 7160
I need to get the avatar's src attribute from Facebook.
doc = Nokogiri::HTML(open('http://www.facebook.com/zuck'))
Then I tried:
avatar = doc.css('.photoContainer img')
But I received an empty result. How can I get the img src, and why didn't my approach work?
I also tried to find all img elements with XPath, but still got empty results:
Nokogiri::HTML(open('http://www.facebook.com/zuck')).xpath("//img/@src").each do |src|
  puts src
end
Upvotes: 0
Views: 2678
Reputation: 27374
The problem is that the .photoContainer div you're trying to access is not in the actual HTML for the page. It is inserted into the DOM via JavaScript, so Nokogiri can't see it. Nokogiri can only parse static HTML and XML.
If you want to access DOM content generated by JavaScript, you might want to try an automated web browsing tool like Watir or Selenium. Also see "Nokogiri parse ajax-loaded content".
UPDATE:
If you're familiar with integration testing using Capybara, you can also use its selectors as a wrapper around a browsing tool like Selenium, which can be a bit tricky to use directly.
So, for example, in a console:
require 'capybara'
require 'capybara/dsl'
include Capybara::DSL
Capybara.default_driver = :selenium
Then you can get the element by first closing the pop-up and then selecting it via CSS:
visit('http://www.facebook.com/zuck')
find('a.layerCancel').click
find('.photoContainer img')['src']
#=> "http://profile.ak.fbcdn.net/hprofile-ak-ash3/c23.1.285.285/s160x160/73273_773684942011_2125564_n.jpg"
Upvotes: 2