dMazay

Reputation: 65

XPath: get text after the first HTML tag

I have the following block:

<div class="text">
  <h1>head1</h1>
    Text1 <br/><br/> text12  <br/><br/> text 13
  <h1>head11</h1>
    Text11
  <h3>head3</h3>
    Text2
</div>

How can I get the text after the first h1, ignoring the <br/><br/> tags, so that the result is:

Text1 
text12
text 13

I use Grab (Python):

page = g.doc.select('//div[@class="text"]/h1[1]/following-sibling::text()')

The result is:

Text1
text12
text 13
Text11
Text2

Upvotes: 2

Views: 604

Answers (1)

Daniel Haley

Reputation: 52848

You could try selecting the text() that only has one preceding h1 sibling...

//div[@class='text']/text()[count(preceding-sibling::h1)=1]
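A minimal sketch of how that expression behaves on the sample markup from the question. It assumes lxml is installed and evaluates the XPath with lxml directly rather than through Grab. The function name text_after_first_h1 is made up for illustration.

```python
from lxml import etree

# Sample markup copied from the question.
HTML = """
<div class="text">
  <h1>head1</h1>
    Text1 <br/><br/> text12  <br/><br/> text 13
  <h1>head11</h1>
    Text11
  <h3>head3</h3>
    Text2
</div>
"""


def text_after_first_h1(markup):
    """Return the stripped text nodes that have exactly one preceding h1 sibling."""
    root = etree.HTML(markup)
    nodes = root.xpath(
        "//div[@class='text']/text()[count(preceding-sibling::h1)=1]"
    )
    # Drop whitespace-only nodes and trim the rest.
    return [node.strip() for node in nodes if node.strip()]


if __name__ == "__main__":
    for line in text_after_first_h1(HTML):
        print(line)
```

Text nodes before the first h1 have zero preceding h1 siblings and those after the second h1 have two, so only Text1, text12 and text 13 should match. The same expression string can presumably be passed to g.doc.select() as in the question.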

Another alternative is to try using the Kayessian method...

//div[@class='text']/h1[1]/following-sibling::text()[count(.|//div[@class='text']/h1[1+1]/preceding-sibling::text()) = count(//div[@class='text']/h1[1+1]/preceding-sibling::text())]

Here's a better example and explanation of the Kayessian method.
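The Kayessian expression above is a node-set intersection of the form $ns1[count(.|$ns2) = count($ns2)]: a node from $ns1 is kept only if adding it to $ns2 does not make $ns2 larger. A rough sketch of building it for the n-th h1, again assuming lxml is installed. The helper name text_between_h1 and its parameter n are hypothetical.

```python
from lxml import etree

# Sample markup copied from the question.
HTML = """
<div class="text">
  <h1>head1</h1>
    Text1 <br/><br/> text12  <br/><br/> text 13
  <h1>head11</h1>
    Text11
  <h3>head3</h3>
    Text2
</div>
"""


def text_between_h1(markup, n):
    """Return stripped text nodes between the n-th and (n+1)-th h1 (1-based)."""
    # ns1: text nodes after the n-th h1.
    after = f"//div[@class='text']/h1[{n}]/following-sibling::text()"
    # ns2: text nodes before the (n+1)-th h1.
    before = f"//div[@class='text']/h1[{n + 1}]/preceding-sibling::text()"
    # Keep a node from ns1 only if it is also a member of ns2.
    expr = f"{after}[count(.|{before}) = count({before})]"
    root = etree.HTML(markup)
    return [node.strip() for node in root.xpath(expr) if node.strip()]


if __name__ == "__main__":
    for line in text_between_h1(HTML, 1):
        print(line)
```

One thing to keep in mind: when there is no (n+1)-th h1, the second node-set is empty, so the predicate is never true and nothing is returned. With this markup, text_between_h1(HTML, 2) gives an empty list rather than Text11 and Text2.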

Upvotes: 1
