Reputation: 31248
Is there an efficient way with jsoup to get the tag path of an HTML element, i.e. all the tags that are open but not yet closed when that element is reached?
E.g. if the HTML is
<!DOCTYPE html>
<html>
<head>...</head>
<body>
<section id="secID">
<div class="divClass">
<section id="subSection">
<h3>Heading</h3>
<ul class="list">
<li>
when I get to li, I want its path to be html->body->section->div->section->ul
Upvotes: 0
Views: 1172
Reputation: 11712
To get the list of 'open' elements, you can simply use the Element.parents() method. It returns the ancestors of the element, closest parent first, so if you want the list to start with the root element you only have to reverse it.
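A minimal sketch of that idea, assuming jsoup is on the classpath. The class name OpenTagPath and the helper openTagPath are made up for illustration, and the HTML string is a shortened copy of the snippet from the question with an empty head:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class OpenTagPath {

    // Hypothetical helper: tag names of all ancestors of `element`, root first.
    static List<String> openTagPath(Element element) {
        List<String> path = new ArrayList<>();
        for (Element ancestor : element.parents()) { // closest ancestor first
            path.add(ancestor.tagName());
        }
        Collections.reverse(path); // reorder so that html comes first
        return path;
    }

    public static void main(String[] args) {
        // Unclosed tags on purpose, as in the question; jsoup closes them itself.
        String html = "<!DOCTYPE html><html><head></head><body>"
                + "<section id=\"secID\"><div class=\"divClass\">"
                + "<section id=\"subSection\"><h3>Heading</h3>"
                + "<ul class=\"list\"><li>item";
        Document doc = Jsoup.parse(html);
        Element li = doc.selectFirst("li");
        System.out.println(String.join("->", openTagPath(li)));
    }
}
```

For the li in this input, that should print html->body->section->div->section->ul. The cost is one step per ancestor, so it only depends on how deeply the element is nested.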
Upvotes: 1
Reputation: 17745
I believe a good way would be to check whether the element you are on has children, via the children() method. If it has, you put that element in a list and continue with its first child, doing the same there, then with the next one, and so on. When there are no children left, you have your list. It's a recursive idea: you then do the same with the second child, and so on.
EDIT A bit of explanation
Let's say you are on the html tag. Call children() and start going through the returned list. On the first element call children() again, which returns another list; on its first element call children() again, and so on. When you reach an element with no children, you go back up to the parent element and continue with its second child. It ends when you have visited all the nodes of the initial list (the one from the html element). It's a recursive idea that visits every element, so the efficiency is compromised, but it's solid.
<html> <--- children: head, body
<head>text</head> <--- just a text node, so no child elements
<body> <--- second child of html; children: ul
<ul> <--- empty, no child elements; go back up to the parent element
</ul>
</body>
</html>
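The walk described above could be sketched roughly like this, assuming jsoup is on the classpath. PathWalker and walk are hypothetical names, and the deque of open tags is my own addition to show where the path comes from; it holds the tags that have been entered but not yet left:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class PathWalker {

    // Depth-first walk over child elements. `open` holds the tag names of the
    // elements entered but not yet left; `out` collects one path per element.
    static void walk(Element el, Deque<String> open, List<String> out) {
        open.addLast(el.tagName());            // this tag is now open
        out.add(String.join("->", open));      // path from the start element to el
        for (Element child : el.children()) {  // element children only, no text nodes
            walk(child, open, out);
        }
        open.removeLast();                     // this tag is closed again
    }

    public static void main(String[] args) {
        String html = "<html><head></head><body><ul></ul></body></html>";
        Element root = Jsoup.parse(html).selectFirst("html");
        List<String> paths = new ArrayList<>();
        walk(root, new ArrayDeque<>(), paths);
        paths.forEach(System.out::println);
    }
}
```

Unlike walking up from a single element, this touches every element in the document, so it is mostly useful when you want the path of each element in one pass.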
Upvotes: 1