earlyadopter

Reputation: 1547

watir-webdriver: how to retrieve the entire line of HTML in which I found a substring?

I've got something like this in the HTML coming from the server:

<html ...>
<head ...>
....
<link href="http://mydomain.com/Digital_Cameras--~all" rel="canonical" />

<link href="http://mydomain.com/Digital_Cameras--~all/sec_~product_list/sb_~1/pp_~2" rel="next" />
...
</head>
<body>
...
</body>
</html>

If b holds the browser object navigated to the page I need to look through, I can detect rel="canonical" with b.html.include?, but how can I retrieve the entire line in which this substring was found? I also need the next non-empty line after it.

Upvotes: 1

Views: 422

Answers (2)

Justin Ko

Reputation: 46846

You can use a CSS locator (or XPath) to get the link elements.

The following returns the HTML (which would be the line) of the link element whose rel attribute has the value "canonical":

b.element(:css => 'link[rel="canonical"]').html
#=> <link href="http://mydomain.com/Digital_Cameras--~all" rel="canonical" />

I am not sure what you mean by "I also need the next (not empty) one". If you mean that you want the link whose rel attribute has the value "next", you can do the same thing:

b.element(:css => 'link[rel="next"]').html
#=> <link href="http://mydomain.com/Digital_Cameras--~all/sec_~product_list/sb_~1/pp_~2" rel="next" />

Upvotes: 5

orde

Reputation: 5283

You could use String#each_line to iterate over the lines of b.html and check each one for rel=:

b.goto('http://www.iana.org/domains/special')
b.html.each_line {|line| puts line if line.include? "rel="}

That should print every line that contains rel= (although it can also print lines you don't want, such as <a> tags with rel attributes).

Alternatively, you could use Nokogiri to parse the HTML:
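
The each_line approach above prints matching lines but does not give you the next non-empty line the question asks for. A minimal sketch of one way to get both, assuming the markup is available as a string: the sample string below stands in for b.html, and the helper name line_and_next_nonblank is hypothetical, not part of Watir.

```ruby
# Hypothetical helper: returns the first line containing `needle`
# and the next line after it that is not blank (nil if there is none).
def line_and_next_nonblank(html, needle)
  lines = html.lines.map(&:strip)
  idx = lines.index { |line| line.include?(needle) }
  return [nil, nil] if idx.nil?

  following = lines[(idx + 1)..-1].find { |line| !line.empty? }
  [lines[idx], following]
end

# Sample markup standing in for b.html (assumption for illustration)
html = <<~HTML
  <head>
  <link href="http://mydomain.com/Digital_Cameras--~all" rel="canonical" />

  <link href="http://mydomain.com/Digital_Cameras--~all/sec_~product_list/sb_~1/pp_~2" rel="next" />
  </head>
HTML

match, following = line_and_next_nonblank(html, 'rel="canonical"')
puts match
puts following
```

This only works when the server's line breaks survive in the string you search. b.html is serialized by the browser, so its line structure may differ from the raw response.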

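require 'nokogiri'
require 'open-uri'

# URI.open (Ruby 2.5+) replaces the bare open(url) call, which Ruby 3.0 no longer supports
doc = Nokogiri::HTML(URI.open("http://www.iana.org/domains/special"))
nodes = doc.css('link')  # all <link> elements in the document
nodes.each { |node| puts node }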

Upvotes: 0
