Reputation: 1479
I am using the Nokogiri gem to parse an HTML table in which one column contains a list of names. Some of those names are hyperlinked and some are not. When I use this code:
puts doc.xpath("//table//tr//td[1]/text()")
it skips the hyperlinked names. I can get the hyperlinked names separately with this:
doc.xpath('//table//tr//td[1]//a[@href]').each do |link|
  puts link.text.strip
end
How can I get all names without having to do it twice?
Upvotes: 2
Views: 423
Reputation: 37527
If you want all text in the cell, hyperlinked or not:
doc.xpath('//td[1]').each do |cell|
  puts cell.text.strip
end
Note: in a valid HTML document, a td will always be within a table and a tr. If you don't have any other selector requirements, you can simplify as above.
Upvotes: 1