Reputation: 963
I have XML like the following:
<div id="test">
<div id="mw-normal-catlinks" class="mw-normal-catlinks">
<a href="/wiki/Help:Category" title="Help:Category">Categories</a>:
<ul>
<li>
<a href="/wiki/Category:1961_births" title="Category:1961 births">1961 births</a>
</li>
<li>
<a href="/wiki/Category:Gadjah_Mada_University_alumni" title="Category:Gadjah Mada University alumni">Gadjah Mada University alumni</a>
</li>
</ul>
</div>
<div id="mw-hidden-catlinks" class="mw-hidden-catlinks mw-hidden-cats-hidden">
<ul>
<li>
<a href="/wiki/Category:Pages_using_web_citations_with_no_URL" title="Category:Pages using web citations with no URL">Pages using web citations with no URL</a>
</li>
<li>
<a href="/wiki/Category:CS1_Indonesian-language_sources_(id)" title="Category:CS1 Indonesian-language sources (id)">CS1 Indonesian-language sources (id)</a>
</li>
</ul>
</div>
</div>
I want to extract only the categories "1961 births" and "Gadjah Mada University alumni" from [div id="mw-normal-catlinks"].
The following XPath returns what I want, but it also extracts "Pages using web citations with no URL" and "CS1 Indonesian-language sources (id)" from [div id="mw-hidden-catlinks"]:
//a[contains(@href,"/wiki/Category")]
Using the XPath below, I get no result:
//DIV[@id="mw-normal-catlinks"]/a[contains(@href,"/wiki/Category")]
Can anyone help me with the correct XPath?
Upvotes: 0
Views: 40
Reputation: 1105
This should do:
.//div[@id="mw-normal-catlinks"]/ul//a
It returns both "a" elements, "1961 births" and "Gadjah Mada University alumni", from div[@id="mw-normal-catlinks"].
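To see why the expression from the question returns nothing, here is a minimal sketch using Python's standard-library xml.etree.ElementTree on a trimmed copy of the markup. This is an assumption about tooling: the question does not say which XPath engine is in use, and ElementTree supports only a subset of XPath (no contains(), and paths must be relative, hence the leading dot). The helper name link_texts is made up for illustration. The sketch shows two things: "/a" selects only direct children of the div, so it never reaches the links nested under ul/li, and element names in XML are case-sensitive, so DIV does not match div.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the markup from the question (one hidden category kept).
SAMPLE = """
<div id="test">
  <div id="mw-normal-catlinks" class="mw-normal-catlinks">
    <a href="/wiki/Help:Category" title="Help:Category">Categories</a>:
    <ul>
      <li><a href="/wiki/Category:1961_births">1961 births</a></li>
      <li><a href="/wiki/Category:Gadjah_Mada_University_alumni">Gadjah Mada University alumni</a></li>
    </ul>
  </div>
  <div id="mw-hidden-catlinks" class="mw-hidden-catlinks mw-hidden-cats-hidden">
    <ul>
      <li><a href="/wiki/Category:Pages_using_web_citations_with_no_URL">Pages using web citations with no URL</a></li>
    </ul>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE)


def link_texts(path):
    """Return the text of every element matched by an ElementTree path."""
    return [a.text for a in root.findall(path)]


# The expression from the answer: descend into the ul, then take any nested a.
visible = link_texts('.//div[@id="mw-normal-catlinks"]/ul//a')

# Child axis only: matches the "Categories" help link, not the category links.
direct_children = link_texts('.//div[@id="mw-normal-catlinks"]/a')

# Upper-case element name: matches nothing, because XML names are case-sensitive.
upper_case = link_texts('.//DIV[@id="mw-normal-catlinks"]/ul//a')

print(visible)
print(direct_children)
print(upper_case)
```

With a full XPath 1.0 engine, the href filter from the question can be kept by switching to the descendant axis: //div[@id="mw-normal-catlinks"]//a[contains(@href,"/wiki/Category")]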
Upvotes: 2