JavaSheriff

Reputation: 7675

jsoup: fetch the element following a specific element

Is it possible to get the following element using jsoup when I know the previous element?

For example, in this HTML I have the data of the table containing "Given this item".
I would like to fetch the next table, the one containing "Looking for this".

<table><tr><td>irrelevant info 1 <a href="http://jsoup.org/">jsoup</a></td></tr></table>
<p>there is a p here</p>
<table><tr><td>Given this item <a href="http://jsoup.org/">jsoup</a></td></tr></table>
<p>there is a p here</p>
<table><tr><td>Looking for this <a href="http://jsoup.org/">jsoup</a></td></tr></table>
<p>there is a p here</p>
<table><tr><td>irrelevant info 2<a href="http://jsoup.org/">jsoup</a></td></tr></table>
<p>there is a p here</p>
<table><tr><td>irrelevant info 3 <a href="http://jsoup.org/">jsoup</a></td></tr></table>

example: http://try.jsoup.org/~vtmUE0bVgNHSxdvpKcIzpL3pHEA

Upvotes: 0

Views: 109

Answers (3)

Eritrean

Reputation: 16498

Alternatively, you can use list.indexOf:

Elements tables = doc.select("table"); // all table elements, in document order
Element given = doc.select("table:contains(Given this item)").first(); // your given element
// Note: assumes the given table exists and is not the last table in the document
Element required = tables.get(tables.indexOf(given) + 1); // index of given + 1 = index of the required element

Upvotes: 1

JavaSheriff

Reputation: 7675

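Thank you, TDG.

According to the jsoup cookbook:

siblingA ~ siblingX: finds sibling X element preceded by sibling A, e.g. h1 ~ p

So I ended up using this selector:

table:contains(Given this item) ~ table

It matches every table that follows the given one, so I then took the first match with e.first(), where e is the Elements result of that selector.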
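As a rough illustration of the sibling-combinator approach above, here is a minimal sketch. The class name, the `nextTableAfter` helper, and the trimmed `HTML` constant are hypothetical and only stand in for the question's markup. It also assumes the known text contains no characters that would need escaping inside `:contains(...)`.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SiblingSelectorSketch {

    // Trimmed copy of the markup from the question (links omitted).
    static final String HTML =
          "<table><tr><td>irrelevant info 1</td></tr></table>"
        + "<p>there is a p here</p>"
        + "<table><tr><td>Given this item</td></tr></table>"
        + "<p>there is a p here</p>"
        + "<table><tr><td>Looking for this</td></tr></table>"
        + "<p>there is a p here</p>"
        + "<table><tr><td>irrelevant info 2</td></tr></table>";

    // Hypothetical helper: returns the first table that follows the table
    // containing knownText, or null if no such table exists.
    static Element nextTableAfter(Document doc, String knownText) {
        // "A ~ B" matches every B that comes after a sibling A,
        // so first() picks the nearest following table.
        return doc.select("table:contains(" + knownText + ") ~ table").first();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        Element next = nextTableAfter(doc, "Given this item");
        System.out.println(next == null ? "no following table" : next.text());
    }
}
```

Because the general sibling combinator skips over the intervening `<p>` elements, this works even when the tables are not directly adjacent.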

Upvotes: 1

TDG

Reputation: 6151

Here is how your HTML is structured once jsoup has parsed it.
If we use the following selector, Element e = doc.select("tr:contains(Given this item)").first();, we get

<tr>
 <td>Given this item <a href="http://jsoup.org/">jsoup</a></td>
</tr>

Now, the parent of this element, given by e.parents().first(), is

<tbody>
 <tr>
  <td>Given this item <a href="http://jsoup.org/">jsoup</a></td>
 </tr> 
</tbody>

And its parent, e.parents().first().parents().first(), is

<table>
 <tbody>
  <tr>
   <td>Given this item <a href="http://jsoup.org/">jsoup</a></td>
  </tr> 
 </tbody>
</table>

And now you can get its sibling with e.parents().first().parents().first().nextElementSibling(), which, if nothing sits between the two tables, gives

<table>
 <tbody>
  <tr>
   <td>Looking for this <a href="http://jsoup.org/">jsoup</a></td>
  </tr> 
 </tbody>
</table>

But it's pretty ugly... so instead you can query for the table directly with Elements e = doc.select("table:contains(Given this item)"); to get

<table>
 <tbody>
  <tr>
   <td>Given this item <a href="http://jsoup.org/">jsoup</a></td>
  </tr> 
 </tbody>
</table>

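and then the element you are looking for is e.first().nextElementSibling(), provided the two tables are directly adjacent. If another element sits between them, such as the <p> in your HTML, nextElementSibling() returns that element instead, so you need to keep stepping until you reach a table.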
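Since the question's markup has a `<p>` between consecutive tables, a single `nextElementSibling()` call lands on that `<p>`. A minimal sketch that keeps stepping through siblings until it reaches a table could look like this. The class name, the `nextTableSibling` helper, and the trimmed `HTML` constant are hypothetical, not part of jsoup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class NextSiblingSketch {

    // Trimmed copy of the markup from the question (links omitted).
    static final String HTML =
          "<table><tr><td>Given this item</td></tr></table>"
        + "<p>there is a p here</p>"
        + "<table><tr><td>Looking for this</td></tr></table>"
        + "<p>there is a p here</p>"
        + "<table><tr><td>irrelevant info 2</td></tr></table>";

    // Hypothetical helper: walks forward through the siblings of start and
    // returns the first one that is a <table>, or null if none follows.
    static Element nextTableSibling(Element start) {
        Element sibling = start.nextElementSibling();
        while (sibling != null && !"table".equals(sibling.tagName())) {
            sibling = sibling.nextElementSibling();
        }
        return sibling;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        Element given = doc.select("table:contains(Given this item)").first();
        Element next = (given == null) ? null : nextTableSibling(given);
        System.out.println(next == null ? "no following table" : next.text());
    }
}
```

The loop makes the traversal independent of how many non-table elements separate the two tables, at the cost of a few extra lines compared with a single selector.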

Upvotes: 1
