Reputation: 11678
I'm using HtmlUnit in Java to scrape my website.
I have the following div inside my HTML page:
<div class="items">
<div class="item_title">
<span class="title">TEXT</span>
</div>
</div>
I have the HtmlDivision object that contains the full div.
I'm trying to get to the span element using:
List<?> titleSpans = div.getByXPath("/span");
But this returns all the spans in the page.
How can I search for span elements that are only inside this single HtmlDivision element?
Upvotes: 1
Views: 189
Reputation: 6168
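In XPath, regardless of the technology you use, a leading single slash (/) starts an absolute path at the document root, so "/span" matches only a span that is the root element itself. A leading double slash (//) is shorthand for the descendant-or-self axis: "//span" matches span elements at any depth, so you do not have to spell out the full path to the desired element. Note that a path beginning with // is still evaluated from the document root, not from the node you call it on, which is why it returns every span in the page.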
If you want a more specific path, use a predicate. For example, "//span[@class='title']".
UPDATE: To limit the search to the current node, put a dot (.) before the double slash. For example, ".//span[@class='title']". The dot stands for the context node, so the path matches only its descendants.
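To make the difference concrete, here is a minimal sketch that evaluates the three expressions against a small document. It deliberately uses the JDK's built-in javax.xml.xpath API rather than HtmlUnit, so it runs without extra dependencies; the XPath expressions are the same ones discussed above. The class name XPathScopeDemo, the helper methods, and the two-div sample page are assumptions made up for illustration, not part of the original question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathScopeDemo {
    // Hypothetical sample page: two "items" divs, so scoping makes a visible difference.
    static final String PAGE =
        "<html><body>"
      + "<div class=\"items\"><div class=\"item_title\">"
      + "<span class=\"title\">TEXT</span></div></div>"
      + "<div class=\"items\"><div class=\"item_title\">"
      + "<span class=\"title\">OTHER</span></div></div>"
      + "</body></html>";

    // Parses a well-formed XML/XHTML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the first <div class="items">, standing in for the HtmlDivision.
    static Node firstItemsDiv(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Node) xpath.evaluate("(//div[@class='items'])[1]", doc, XPathConstants.NODE);
    }

    // Counts the nodes that `expression` selects with `context` as the context node.
    static int count(Node context, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, context, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        Node div = firstItemsDiv(parse(PAGE));
        System.out.println(count(div, "/span"));   // 0: the root element is html, not span
        System.out.println(count(div, "//span"));  // 2: every span in the document
        System.out.println(count(div, ".//span")); // 1: only spans inside this div
    }
}
```

With HtmlUnit, the corresponding call on the HtmlDivision from the question would be div.getByXPath(".//span").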
Upvotes: 1