TacticalGoat

Reputation: 63

XPath: getting text with mixed elements in the same div

Here is some sample HTML:

<div class="something">
  <p> This is a <b> Paragraph </b> with <a href="/something"> mixed </a> elements
 <p> Next paragraph....
</div>

What I tried was

//div[contains('@class','something')/text()

and

//div[contains('@class','something')/*/text()

and

//div[contains('@class','something')/p/text()

All of these seem to skip the text inside the 'b' and 'a' tags.

Upvotes: 1

Views: 746

Answers (3)

xruptronics

Reputation: 2975

Try " ".join(sel.xpath("//div[contains(@class,'something')]//text()").extract()), where sel is your selector (in your case, probably response).

Upvotes: 3

Tomáš Linhart

Reputation: 10220

It depends on what you want to obtain and in what form. In any case, there are a couple of problems with what you tried:

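  • You are missing a closing bracket (]) after contains(...) in the XPath expression.
  • @class should not be enclosed in (single) quotes when used inside contains. Quoted, it is a string literal rather than a reference to the attribute, so the test never matches.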

If you want to get all the text of the div element as one string, you might use

normalize-space(//div[contains(@class,'something')])
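As a rough illustration of what normalize-space does with the div's string value, here is a minimal sketch using only Python's standard library. It is not Scrapy code: xml.etree.ElementTree does not support contains() or normalize-space(), so the string value is built with itertext() and the whitespace is collapsed by hand. The sample markup is assumed to be well-formed (the p tags are closed here, unlike in the question), and SAMPLE, string_value and normalized are names made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: a well-formed copy of the sample markup (closing </p> tags added).
SAMPLE = """<div class="something">
  <p> This is a <b> Paragraph </b> with <a href="/something"> mixed </a> elements</p>
  <p> Next paragraph....</p>
</div>"""

div = ET.fromstring(SAMPLE)

# The XPath string value of an element is the concatenation of all of its
# descendant text nodes, in document order. itertext() yields those pieces.
string_value = "".join(div.itertext())

# Emulate normalize-space(): trim both ends and collapse every run of
# whitespace into a single space. (str.split() treats a few more characters
# as whitespace than XPath does, which makes no difference for this sample.)
normalized = " ".join(string_value.split())

print(normalized)
# This is a Paragraph with mixed elements Next paragraph....
```

Note that in XPath 1.0, normalize-space applied to a node-set uses only the first matching node in document order, so if several div elements match, only the first one's text is returned.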

Upvotes: 1

zx485

Reputation: 29052

Use the XPath expression

//div[contains(@class,'something')]//text()

to select all the text() nodes anywhere inside the chosen div element, including those nested in b and a. Concatenating them gives the text of the whole div.

Output:

This is a  Paragraph  with  mixed  elements  
Next paragraph....
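To see why /text() skips the b and a content while //text() does not, here is a small sketch using Python's standard library rather than Scrapy. ElementTree has no text() step, so the two cases are approximated: direct-child text nodes are the element's .text plus each child's .tail, and descendant text nodes come from itertext(). The markup is assumed to be a well-formed copy of the sample, and the names SAMPLE, direct and descendant are only for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: a well-formed copy of the sample markup (closing </p> tags added).
SAMPLE = (
    '<div class="something">'
    '<p> This is a <b> Paragraph </b> with <a href="/something"> mixed </a> elements</p>'
    '<p> Next paragraph....</p>'
    '</div>'
)

div = ET.fromstring(SAMPLE)
first_p = div.find("p")

# Approximation of p/text(): only text nodes that are direct children of <p>.
# In ElementTree these are p.text and the .tail of each child element.
direct = [first_p.text] + [child.tail for child in first_p]
direct = [t for t in direct if t]

# Approximation of p//text(): every text node below <p>, at any depth.
descendant = list(first_p.itertext())

print(direct)
# [' This is a ', ' with ', ' elements']
print(descendant)
# [' This is a ', ' Paragraph ', ' with ', ' mixed ', ' elements']
```

The words "Paragraph" and "mixed" are children of b and a, not of p, which is why the single-slash form never returns them.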

Upvotes: 2
