Reputation: 28632
I have an HTML document like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<title>Page Title</title>
<style type="text/css">
</style>
</head>
<body>
<div class="section">
<table>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</td><td>test</td><td>test</td>
</tr>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</td><td>test</td><td>test</td>
</tr>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</td><td>test</td><td>test</td>
</tr>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</td><td>test</td><td>test</td>
</tr>
</table>
</div>
<div class="section">
<table>
<tr>
<td>test</td><td>test</td><td>test</td><td>test</td>
</tr>
<tr>
<td>test</td><td>test</td><td>test</td><td>test</td>
</tr>
<tr>
<td>test</td><td>test</td><td>test</td><td>test</td>
</tr>
<tr>
<td>test</td><td>test</td><td>test</td><td>test</td>
</tr>
</table>
</div>
<div class="section">
<table>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</td><td>test</td><td>test</td>
</tr>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</td><td>test</td><td>test</td>
</tr>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</td><td>test</td><td>test</td>
</tr>
<tr>
<td>this_is_what_i_want</td><td>this_is_what_i_want</td><td>test</td><td>test</td>
</tr>
</table>
</div>
</body>
</html>
I want to get the first two td elements in every row of the first and third table elements. How can I get this result?
Note that the two td elements in a row are related, so you can't treat all td elements the same way. For example, how do I concatenate the contents of the two td elements in a row?
Upvotes: 1
Views: 298
Reputation: 37527
It can also be done using two XPath statements:
doc.xpath('//div[position()=1 or position()=3]/table/tr').map { |row| row.xpath('concat(td[1], " ", td[2])') }
The reason it can't be done in a single XPath statement is that XPath 1.0 string functions such as concat() convert a node-set argument to the string value of its first node only (in document order). You can do node selection or concatenation, but not both at once.
Note that in XPath 2.0 it can be done using the string-join() function, but Nokogiri supports only XPath 1.0.
Upvotes: 2
Reputation: 55002
doc.xpath('//div[position()=1 or position()=3]/table/tr').map{|tr| tr.css('td')[0..1].map(&:text).join(' ')}
Upvotes: 2