Reputation: 63

XPath: getting text with mixed elements in the same div

Here is some sample HTML:

<div class="something">
  <p> This is a <b> Paragraph </b> with <a href="/something"> mixed </a> elements
 <p> Next paragraph....
</div>

What I tried was:

//div[contains('@class','something')/text()

and

//div[contains('@class','something')/*/text()

and

//div[contains('@class','something')/p/text()

All of these seem to skip the text inside the 'b' and 'a' tags.

Upvotes: 1

Answers (3)

Reputation: 2975

Try " ".join(sel.xpath("//div[contains(@class,'something')]//text()").extract()), where sel is your selector (in your case, probably response).
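For readers who want to try this outside a Scrapy project, here is a minimal sketch that uses lxml as a stand-in for the Scrapy selector. The names SAMPLE and div_text are hypothetical, and the sample markup is assumed to have its p tags closed; the XPath expression itself is the one from this answer.

```python
from lxml import html

# Assumed version of the question's markup, with the <p> tags closed.
SAMPLE = """
<div class="something">
  <p> This is a <b> Paragraph </b> with <a href="/something"> mixed </a> elements</p>
  <p> Next paragraph....</p>
</div>
"""


def div_text(markup: str) -> str:
    """Join every descendant text node of the matching div into one string."""
    tree = html.fromstring(markup)
    # //text() selects text nodes at any depth, so text inside <b> and <a> is included.
    parts = tree.xpath("//div[contains(@class,'something')]//text()")
    # Join the pieces, then collapse runs of whitespace left over from the markup.
    return " ".join(" ".join(parts).split())


print(div_text(SAMPLE))
```

The extra split/join step is optional; it only tidies the whitespace that the source indentation leaves between text nodes.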

Upvotes: 3

Reputation: 10220

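It depends on what you want to obtain and how. In any case, there are a couple of problems with what you tried: the predicate is missing its closing square bracket, and '@class' in quotes is a string literal rather than a reference to the class attribute.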

If you want all the text of the div element as a single string, you can use

normalize-space(//div[contains(@class,'something')])
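As a rough illustration, the expression can be evaluated with lxml, which returns a plain string for XPath string functions. This is only a sketch: SAMPLE and div_text_normalized are hypothetical names, and the markup is assumed to have its p tags closed.

```python
from lxml import html

# Assumed version of the question's markup, with the <p> tags closed.
SAMPLE = """
<div class="something">
  <p> This is a <b> Paragraph </b> with <a href="/something"> mixed </a> elements</p>
  <p> Next paragraph....</p>
</div>
"""


def div_text_normalized(markup: str) -> str:
    """Return the whitespace-normalized string value of the first matching div."""
    tree = html.fromstring(markup)
    # normalize-space() takes the string value of the first node in the node-set,
    # trims both ends, and collapses internal whitespace runs to single spaces.
    return str(tree.xpath("normalize-space(//div[contains(@class,'something')])"))


print(div_text_normalized(SAMPLE))
```

Note that in XPath 1.0 only the first matching div is used, so this suits pages where a single div matches.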

Upvotes: 1

Reputation: 29052

Use the XPath expression

//div[contains(@class,'something')]//text()

to select every text() node inside the chosen div element, including those nested in child elements such as b and a. Concatenating the selected nodes gives the full text.

Output:

This is a  Paragraph  with  mixed  elements  
Next paragraph....
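To see why the attempts in the question skipped the b and a text, it may help to compare the child step (/text()) with the descendant step (//text()). The sketch below uses lxml; SAMPLE, direct, and nested are hypothetical names, and the markup is assumed to have its p tags closed.

```python
from lxml import html

# Assumed version of the question's markup, with the <p> tags closed.
SAMPLE = """
<div class="something">
  <p> This is a <b> Paragraph </b> with <a href="/something"> mixed </a> elements</p>
  <p> Next paragraph....</p>
</div>
"""

tree = html.fromstring(SAMPLE)

# p/text() matches only text nodes that are direct children of each <p>.
direct = [t.strip() for t in tree.xpath("//div[contains(@class,'something')]/p/text()")]

# p//text() also descends into <b> and <a>.
nested = [t.strip() for t in tree.xpath("//div[contains(@class,'something')]/p//text()")]

print(direct)  # text of <b> and <a> is missing
print(nested)  # every text node, in document order
```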

Upvotes: 2