jianbing Ma

Reputation: 375

How can I extract this content with XPath?

I have a piece of HTML like this. How can I get the title text?

<a class="question_link" href="/n/1639322" target="_blank">
<div class="question_text_icons">
<span></span>
</div>
"
This is the page title, which I want to get.
"
</a>

My XPath is

//a[@class="question_link"]/text()

but the output was

"\n"
"\nThis is the page title, which I want to get.\n"

I only want "This is the page title, which I want to get.".

Upvotes: 1

Views: 30

Answers (2)

har07

Reputation: 89325

Another option is to use normalize-space() in a predicate to filter out whitespace-only text nodes:

//a[@class="question_link"]/text()[normalize-space()]
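A minimal sketch of how this could be checked from Python, assuming the lxml library is installed and that the quote marks in the question's HTML are only how the browser inspector displays the text node (so they are left out of the snippet here). The names SNIPPET, raw, filtered, title and title_xpath are made up for illustration. Note that the predicate only drops the whitespace-only node; the surviving node still carries its leading and trailing newlines, so it is trimmed afterwards, either in Python or by wrapping the expression in normalize-space().

```python
from lxml import html

# The markup from the question, without the inspector's quote marks (assumption).
SNIPPET = """<a class="question_link" href="/n/1639322" target="_blank">
<div class="question_text_icons">
<span></span>
</div>
This is the page title, which I want to get.
</a>"""

tree = html.fromstring(SNIPPET)

# Original query: returns every text node that is a direct child of <a>,
# including the whitespace-only one that sits before the <div>.
raw = tree.xpath('//a[@class="question_link"]/text()')

# With the predicate: normalize-space() yields an empty string for
# whitespace-only nodes, which is false in a predicate, so they are dropped.
filtered = tree.xpath('//a[@class="question_link"]/text()[normalize-space()]')

# The remaining node is not trimmed by the predicate; strip it in Python.
title = filtered[0].strip()

# Alternatively, let XPath trim it: normalize-space() applied to a node-set
# uses the first node and returns a plain string.
title_xpath = tree.xpath(
    'normalize-space(//a[@class="question_link"]/text()[normalize-space()])'
)

print(repr(raw))
print(repr(filtered))
print(title)
print(title_xpath)
```

Wrapping the whole expression in normalize-space() also collapses internal runs of whitespace into single spaces, which may or may not be wanted for multi-line titles.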

Upvotes: 2

alecxe

Reputation: 474181

One option would be to locate the inner div and get the following text sibling:

//a[@class="question_link"]/div[@class="question_text_icons"]/following-sibling::text()

Or, get the last text node:

//a[@class="question_link"]/text()[last()]
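Both expressions can be tried against the question's markup with a short script. This is only a sketch, assuming lxml is available and that the quote marks shown in the question are an inspector artifact rather than part of the page; SNIPPET, by_sibling, by_last and title are hypothetical names. For this markup the two queries select the same single text node, and that node still includes the surrounding newlines, so it is stripped in Python.

```python
from lxml import html

# The markup from the question, without the inspector's quote marks (assumption).
SNIPPET = """<a class="question_link" href="/n/1639322" target="_blank">
<div class="question_text_icons">
<span></span>
</div>
This is the page title, which I want to get.
</a>"""

tree = html.fromstring(SNIPPET)

# Option 1: start at the inner <div> and take the text nodes that follow it
# as siblings inside the same <a>.
by_sibling = tree.xpath(
    '//a[@class="question_link"]'
    '/div[@class="question_text_icons"]'
    '/following-sibling::text()'
)

# Option 2: take only the last text node among the direct children of <a>.
by_last = tree.xpath('//a[@class="question_link"]/text()[last()]')

# Neither expression trims whitespace, so strip the result.
title = by_last[0].strip()

print(repr(by_sibling))
print(repr(by_last))
print(title)
```

Both variants depend on the title being the text right after the div. If the page ever places whitespace-only text or further elements after the title, the last text node would no longer be the title, so the normalize-space() predicate is the more tolerant choice.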

Upvotes: 0
