Reputation: 3167
With XPath, I am trying to get values out of the following markup.
HTML:
<ul class="listVideoAttributes alpha only">
<li class="alpha only">
<span>Categories:</span>
<ul>
<li class="psi alpha">
<a href="#">Cinema</a>
</li>
<li class="omega">
<a href="#">HD</a>
</li>
</ul>
</li>
</ul>
The heading is not always "Categories"; sometimes it is "Tags".
I would like an XPath expression that locates the Categories heading and returns the category values, such as Cinema and HD.
For now, I'm using:
//ul[@class="listVideoAttributes"][contains(., 'Categories:')]
and it returns the values, but also the text 'Categories:'.
I would like to do something like:
//ul[@class="listVideoAttributes"][contains(., 'Categories:')]/ul
But it does not seem to work.
Upvotes: 1
Views: 923
Reputation: 38732
Your XPath expression did not work because the inner <ul/> is not a direct child of the outer <ul/>. Use the descendant-or-self axis step //ul instead of the child axis step /ul at the end of your expression. If you're sure the markup will not change, it is better to use only child axis steps: /li/ul/li/a.
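To see the difference between the child step and the descendant step on the markup from the question, here is a minimal sketch using Python's standard xml.etree.ElementTree. It assumes the snippet is well-formed enough to parse as XML, and ElementTree only supports a subset of XPath, so treat it as an illustration rather than a full XPath engine.

```python
import xml.etree.ElementTree as ET

HTML = """
<ul class="listVideoAttributes alpha only">
  <li class="alpha only">
    <span>Categories:</span>
    <ul>
      <li class="psi alpha"><a href="#">Cinema</a></li>
      <li class="omega"><a href="#">HD</a></li>
    </ul>
  </li>
</ul>
"""

outer = ET.fromstring(HTML)  # the outer <ul> is the root element here

# Child step: only direct children of the outer <ul> are considered,
# and those are <li> elements, so nothing matches.
direct = outer.findall("ul")

# Descendant step: <ul> elements at any depth below the outer <ul>.
nested = outer.findall(".//ul")

# Explicit child steps, valid only while the markup keeps this exact shape.
links = [a.text for a in outer.findall("li/ul/li/a")]

print(len(direct), len(nested), links)  # 0 1 ['Cinema', 'HD']
```

The explicit path is the stricter of the two: it stops matching as soon as an extra wrapper element is introduced, whereas the descendant step keeps matching.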
Another problem is that the @class attribute does not equal listVideoAttributes; it only contains it. You should never compare HTML class attributes with equals; always use contains.
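The effect of comparing the class attribute with equals can be sketched as follows, again with Python's xml.etree.ElementTree. ElementTree has no contains() function, so the substring test is emulated in plain Python; the wrapping div is a hypothetical parent added only so the ul can be searched for. The whitespace-token variant is my own addition and is slightly stricter than contains().

```python
import xml.etree.ElementTree as ET

outer = ET.fromstring('<ul class="listVideoAttributes alpha only"><li/></ul>')
wrapper = ET.Element("div")  # hypothetical parent, not part of the original markup
wrapper.append(outer)

# Exact comparison: the attribute value is the whole string
# "listVideoAttributes alpha only", so this finds nothing.
exact = wrapper.findall(".//ul[@class='listVideoAttributes']")

# Substring test, the Python analogue of contains(@class, 'listVideoAttributes').
by_substring = [ul for ul in wrapper.iter("ul")
                if "listVideoAttributes" in ul.get("class", "")]

# Token test: stricter, matches only a complete whitespace-separated class name.
by_token = [ul for ul in wrapper.iter("ul")
            if "listVideoAttributes" in ul.get("class", "").split()]

print(len(exact), len(by_substring), len(by_token))  # 0 1 1
```

The substring test would also match a class such as listVideoAttributesOld, which is why the token test can be preferable when class names share prefixes.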
Anyway, I'd be as specific as possible when searching for the "headline"; otherwise you could get false positives whenever the content of any "listVideoAttributes" list contains "Categories" or "Tags" somewhere:
//ul[contains(@class, 'listVideoAttributes')]/li[contains(span, 'Categories') or contains(span, 'Tags')]//a
You might want to append /text() if you cannot read the string value from the programming language you're using. Reading the string value is usually preferable, e.g. when a link contains bold text like <a href="..."><strong>foo</strong></a>; text() would not return the string value in that case.
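The difference between direct text and the string value can be illustrated with a small sketch in Python's xml.etree.ElementTree. The helper names direct_text and string_value are hypothetical, and el.text is only an approximation of a trailing text() step, since it returns just the text before the first child element.

```python
import xml.etree.ElementTree as ET

plain = ET.fromstring('<a href="#">Cinema</a>')
nested = ET.fromstring('<a href="#"><strong>foo</strong></a>')

def direct_text(el):
    # Text sitting directly inside the element, before any child element.
    return el.text

def string_value(el):
    # Concatenation of all text in the element and its descendants,
    # comparable to the XPath string value.
    return "".join(el.itertext())

print(direct_text(plain), string_value(plain))    # Cinema Cinema
print(direct_text(nested), string_value(nested))  # None foo
```

For the nested link, the direct text is empty because the visible text lives inside the strong element, while the string value still yields foo.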
Upvotes: 1
Reputation: 122424
There are two problems with
//ul[@class="listVideoAttributes"][contains(., 'Categories:')]/ul
First, the outer ul class is not equal to "listVideoAttributes"; it only contains that as a substring. Second, the inner ul is not a direct child of the outer one; it's a grandchild. How about:
//ul[contains(@class, 'listVideoAttributes')][contains(., 'Categories')]/li/ul/li/a
Upvotes: 0
Reputation: 118299
You can try the XPath below:
//ul[contains(@class,'listVideoAttributes') and contains(.//span,'Categories')]//a/text()
output:
Cinema
HD
Upvotes: 0