gongarek

Reputation: 1034

How to stop at a specific tag?

How can I get all the text from one h1 tag up to the next h1 tag?

I have the class name of the starting h1 tag:

...
<h1 class="something">...</h1>
...
<h1 ...>...</h1>
...

I tried: //*[@class='something']//text()

I want to scrape the text from all children and siblings. I don't need the text of the h1 tags themselves. I don't know how to stop scraping at the next h1 tag.

Upvotes: 0

Views: 225

Answers (1)

Alejandro

Reputation: 1882

With a proper example:

<root>
  <h1 class="something">.1.</h1>
  .2.
  <p>.3.</p>
  .4.
  <h1 class="other">.5.</h1>
</root>

This XPath 1.0 expression:

/root//text()[not(ancestor::h1)][preceding::h1[1][@class='something']]

Meaning: "the descendant text nodes of the root element whose first preceding h1 element has a @class attribute equal to 'something' and which have no h1 ancestor"

It selects:

.2.

.3.
.4.
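
To make the selection logic concrete, here is a minimal sketch in plain Python that reproduces what the expression above selects from the example document. It does not run the XPath itself: the standard library's ElementTree only supports a small XPath subset without the preceding axis, so the sketch walks the tree in document order instead. The helper name `text_after_heading` is made up for illustration and is not part of any library.

```python
import xml.etree.ElementTree as ET

# Example document from the answer above.
DOC = """<root>
  <h1 class="something">.1.</h1>
  .2.
  <p>.3.</p>
  .4.
  <h1 class="other">.5.</h1>
</root>"""


def text_after_heading(root, start_class):
    """Collect text whose nearest preceding h1 has class == start_class.

    Hypothetical helper written for this example only.
    """
    collected = []
    collecting = False  # True while the most recent h1 seen is the start heading

    def walk(elem):
        nonlocal collecting
        if elem.tag == "h1":
            # Every h1 switches collection on or off; its own content is skipped.
            collecting = elem.get("class") == start_class
            return
        if collecting and elem.text:
            collected.append(elem.text)
        for child in elem:
            walk(child)
            # In ElementTree, text that follows a child is stored in child.tail.
            if collecting and child.tail:
                collected.append(child.tail)

    walk(root)
    return collected


root = ET.fromstring(DOC)
# Drop whitespace-only nodes and trim the rest for display.
result = [t.strip() for t in text_after_heading(root, "something") if t.strip()]
print(result)  # ['.2.', '.3.', '.4.']
```

The flag is reset by every h1 in document order, which mirrors the `preceding::h1[1]` test, and returning early on h1 elements plays the role of `not(ancestor::h1)`. With an XPath 1.0 engine available (Scrapy selectors, for example), the expression itself should be usable directly instead of this walk.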

You can test it at http://www.xpathtester.com/xpath/ecd4f379b13558572ffd62d0db3a3f98

Upvotes: 3
