Agus Sanjaya

Reputation: 963

Finding href link which contains text

I have XML like the following.

<div id="test">    
    <div id="mw-normal-catlinks" class="mw-normal-catlinks">
        <a href="/wiki/Help:Category" title="Help:Category">Categories</a>:
        <ul>
            <li>
                <a href="/wiki/Category:1961_births" title="Category:1961 births">1961 births</a>
            </li>
            <li>
                <a href="/wiki/Category:Gadjah_Mada_University_alumni" title="Category:Gadjah Mada University alumni">Gadjah Mada University alumni</a>
            </li>
        </ul>
    </div>
    <div id="mw-hidden-catlinks" class="mw-hidden-catlinks mw-hidden-cats-hidden">
        <ul>
            <li>
                <a href="/wiki/Category:Pages_using_web_citations_with_no_URL" title="Category:Pages using web citations with no URL">Pages using web citations with no URL</a>
            </li>
            <li>
                <a href="/wiki/Category:CS1_Indonesian-language_sources_(id)" title="Category:CS1 Indonesian-language sources (id)">CS1 Indonesian-language sources (id)</a>
            </li>
        </ul>
    </div>
</div>

I want to extract the categories "1961 births", "Gadjah Mada University alumni" from [div id="mw-normal-catlinks"].

If I use the following xpath, I get what I want but the xpath also extracts "Pages using web citations with no URL", and "CS1 Indonesian-language sources (id)" from the [div id="mw-hidden-catlinks"].

//a[contains(@href,"/wiki/Category")]

Using the xpath below I get no result.

//DIV[@id="mw-normal-catlinks"]/a[contains(@href,"/wiki/Category")]

Can anyone help me with the correct XPath?

Upvotes: 0

Views: 40

Answers (1)

hiren

Reputation: 1105

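This should do it: .//div[@id="mw-normal-catlinks"]/ul//a. It returns both a elements (1961 births and Gadjah Mada University alumni) from div[@id="mw-normal-catlinks"]. Your second expression returns nothing because /a selects only direct children of the div, while the category links are nested inside ul/li. In addition, element names in XPath are case-sensitive for XML, so DIV does not match div.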
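
As a quick way to check the expression, here is a minimal sketch in Python. It assumes the markup parses as well-formed XML and uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath (for example, no contains()). The names SAMPLE and normal_categories are made up for illustration, and the sample is a shortened copy of the markup in the question.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the markup from the question (title attributes omitted).
SAMPLE = """
<div id="test">
    <div id="mw-normal-catlinks" class="mw-normal-catlinks">
        <a href="/wiki/Help:Category">Categories</a>:
        <ul>
            <li><a href="/wiki/Category:1961_births">1961 births</a></li>
            <li><a href="/wiki/Category:Gadjah_Mada_University_alumni">Gadjah Mada University alumni</a></li>
        </ul>
    </div>
    <div id="mw-hidden-catlinks" class="mw-hidden-catlinks mw-hidden-cats-hidden">
        <ul>
            <li><a href="/wiki/Category:Pages_using_web_citations_with_no_URL">Pages using web citations with no URL</a></li>
        </ul>
    </div>
</div>
"""


def normal_categories(xml_text):
    """Return the link texts under div#mw-normal-catlinks/ul (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Restrict to the wanted div first, then take every <a> below its <ul>.
    links = root.findall('.//div[@id="mw-normal-catlinks"]/ul//a')
    return [a.text for a in links]


print(normal_categories(SAMPLE))
# -> ['1961 births', 'Gadjah Mada University alumni']
```

Going through /ul skips the "Categories" help link, which is a direct child of the div, and anchoring on the id keeps the hidden categories out of the result.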

Upvotes: 2
