Reputation: 156
I checked for similar questions but couldn't find an answer to mine.
I need to collect the text value inside an h1 tag ("text1" in the examples below), which can appear in 3 different situations. Here is the HTML for all 3 cases:
First Case:
<h1 class="h1">
text1
<br>
<span>text2</span>
</h1>
Second Case:
<h1 class="h1">
<span>text1</span>
</h1>
Third Case:
<h1 class="h1">
<br>
text1
<span>text2</span>
</h1>
I used this XPath:
//h1[@class="h1"]/text()[1]|//h1[@class="h1"]/span[1]
But it selects the <br> tag in the third case. Is there any way I can ignore the break tag and get the text1 value in all 3 cases?
Upvotes: 0
Views: 609
Reputation: 29022
Try this:
//h1/descendant-or-self::text()[normalize-space()][1]
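It selects the first descendant text node of h1 that is neither empty nor whitespace-only. The selected node can still carry leading and trailing whitespace (such as the line breaks around text1), so trim the result if you need the bare value.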
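To check the idea against the three cases, here is a minimal sketch in Python using only the standard library. ElementTree's limited XPath subset does not support text() node tests, so the sketch emulates the expression with itertext() rather than evaluating it. The helper name first_text and the CASES list are made up for illustration, and <br> is written as <br/> so the snippets parse as XML.

```python
import xml.etree.ElementTree as ET

# The three cases from the question. Assumption for this sketch:
# <br> is written as <br/> so each snippet is well-formed XML.
CASES = [
    '<h1 class="h1">\ntext1\n<br/>\n<span>text2</span>\n</h1>',
    '<h1 class="h1">\n<span>text1</span>\n</h1>',
    '<h1 class="h1">\n<br/>\ntext1\n<span>text2</span>\n</h1>',
]


def first_text(h1):
    """Return the first non-whitespace text under h1, stripped (hypothetical helper)."""
    # itertext() yields all descendant text in document order, much like
    # descendant-or-self::text(). The `if t.strip()` test plays the role
    # of the [normalize-space()] predicate, and next() the role of [1].
    return next((t.strip() for t in h1.itertext() if t.strip()), None)


results = [first_text(ET.fromstring(html)) for html in CASES]
print(results)
```

With a full XPath 1.0 engine the expression from the answer can be evaluated directly; the emulation above only mirrors its logic.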
Upvotes: 1