Reputation:
How can I search for the element containing "Click Here to Enter a New Password" using Nokogiri::HTML?
My HTML structure looks like this:
<table border="0" cellpadding="20" cellspacing="0" width="100%">
<tbody>
<tr>
<td class="bodyContent" valign="top">
<div>
<strong>Welcome to</strong>
<h2 style="margin-top:0">OddZ</h2>
<a href="http://mandrillapp.com/track/click.php?...">Click Here</a>
to Enter a New Password
<p>
Click this link to enter a new Password. This link will expire within 24 hours, so don't delay.
<br>
</p>
</div>
</td>
</tr>
</tbody>
</table>
I tried:
doc = (Nokogiri::HTML(@inbox_emails.first.body.raw_source))
password_container = doc.search "[text()*='Click Here to Enter a New Password']"
but this did not find a result. When I tried:
password_container = doc.search "[text()*='Click Here']"
I also got no result.
I want to search for the complete text.
I found that there are many spaces before the text " to Enter a New Password", even though I did not add any spaces in the HTML code.
Upvotes: 0
Views: 151
Reputation: 37517
You were close. Here's how you find the text's containing element:
doc.search('*[text()*="Click Here"]')
This gives you the <a> tag. Is this what you want? If you actually want the parent element of the <a>, which is the containing <div>, you can modify it like so:
doc.search('//*[text()="Click Here"]/..').text
This selects the containing <div>, the text of which is:
Welcome to
OddZ
Click Here
to Enter a New Password
Click this link to enter a new Password. This link will expire within 24 hours, so don't delay.
Upvotes: 0
Reputation: 31077
You can use a mix of XPath and regex, but since there's no matches function in XPath for Nokogiri yet, you can implement your own as follows:
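class RegexHelper
  # True if any node's content matches the regex (case-insensitive, multiline).
  def content_matches_regex node_set, regex_string
    ! node_set.select { |node| node.content =~ /#{regex_string}/mi }.empty?
  end
  # Replaces each run of whitespace in the string with ".*?" before matching.
  def content_matches node_set, string
    content_matches_regex node_set, string.gsub(/\s+/, ".*?")
  end
end
search_string = "Click Here to Enter a New Password"
# The helper instance is passed last so the XPath can call content_matches.
matched_nodes = doc.xpath "//*[content_matches(., '#{search_string}')]", RegexHelper.new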
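One thing to be aware of with the gsub above: ".*?" lets arbitrary text sit between the words, and any regex metacharacters in the search string are used unescaped. A stricter variant might look like the sketch below. The helper name whitespace_tolerant_regex is hypothetical (it is not part of Nokogiri), and it only tolerates whitespace between words.

```ruby
# Hypothetical helper (not part of Nokogiri): build a regex in which the
# gaps between words match any run of whitespace, including newlines.
def whitespace_tolerant_regex(phrase)
  # Escape each word so characters like "." or "?" are matched literally.
  words = phrase.split.map { |word| Regexp.escape(word) }
  Regexp.new(words.join('\s+'), Regexp::IGNORECASE)
end

regex = whitespace_tolerant_regex('Click Here to Enter a New Password')

# Text as it might come out of the document, with a newline and indentation.
text = "Click Here\n        to Enter a New Password"

puts regex.match?(text)
puts regex.match?('Click Here now to Enter a New Password')
```

The first check matches despite the newline and extra spaces; the second does not, because an extra word sits between "Here" and "to".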
Upvotes: 1
Reputation: 54984
Much of the text you are searching for is outside of the <a> element.
The best you can do might be:
a = doc.search('a[text()="Click Here"]').find{|a| a.next.text[/to Enter a New Password/]}
Upvotes: 2
Reputation: 11459
You can try using a CSS selector. I've saved your HTML as a file called test.html:
require 'nokogiri'
@doc = Nokogiri::HTML(File.open('test.html'))
puts @result = @doc.css('p').text.gsub(/\n/,'')
It returns:
Click this link to enter a new Password. This link will expire within 24 hours, so don't delay.
There's a good post about Parsing HTML with Nokogiri
Upvotes: 0