File parsing in Ruby

Question

Sample file content:

 CNN
        
The New York Times
        
Google News
        
CNET News.com
        ESPN

Code I am using:

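path = File.join(directory, bookmark.file_file_name)
file = Nokogiri::HTML(File.open(path))
# Visit every element that carries an href attribute
file.search('//*[@href]').each do |m|
  p m          # the whole Nokogiri element
  p m[:href]   # only the value of its href attribute
end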

Output of the above code:

p m

 , #] children=[#]>

p m[:href]

http://maps.google.com/

I want to get both the URL and its link text, i.e. "feed://news.google.com/?output=rss" and "Google News".

Vlad Khomich · Accepted Answer

m.text will return the link text, so you can collect each URL and its text in a hash:

h = {} #=> {}


irb(main):021:0> file.search('//*[@href]').each do |m|
irb(main):022:1* h[m[:href]] = m.text
irb(main):023:1> end
=> 0
irb(main):024:0> h
=> {"http://www.cnn.com/"=>"CNN", "http://www.nytimes.com/"=>"The New York Times", "feed://news.google.com/?output=rss"=>"Google News", "http://www.news.com/"=>"CNET News.com", "http://espn.go.com/"=>"ESPN"}
irb(main):025:0>
