Reputation: 3217
I want to select all the text inside a div,
ignoring the tags nested inside it.
<div>
<p>some text here <a href="">a link here <span>span here</span></a></p>
</div>
I need to get the result as
some text here a link here span here
I tried this
response.xpath('//div/text()')
Upvotes: 5
Views: 5239
Reputation: 111726
You're asking for the string-value of that div:
string(/div)
Or, if you wish whitespace to be trimmed from the ends and consolidated internally:
normalize-space(/div)
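To see what those two expressions compute, here is a minimal sketch that uses Python's standard-library xml.etree.ElementTree rather than Scrapy. ElementTree has no string() or normalize-space() function, so the helpers string_value and normalized are hypothetical names that only approximate them: itertext() walks the same descendant text nodes in document order, and str.split() treats a slightly wider set of characters as whitespace than XPath does. The inner span is closed here so the snippet parses as XML.

```python
import xml.etree.ElementTree as ET

# Markup from the question, with the inner <span> closed so it is well-formed XML.
SNIPPET = """<div>
<p>some text here <a href="">a link here <span>span here</span></a></p>
</div>"""


def string_value(element):
    """Concatenate every descendant text node in document order (like string())."""
    return "".join(element.itertext())


def normalized(element):
    """Trim the ends and collapse whitespace runs (roughly normalize-space())."""
    return " ".join(string_value(element).split())


div = ET.fromstring(SNIPPET)
raw_text = string_value(div)
clean_text = normalized(div)

print(repr(raw_text))  # keeps the newlines that surround the <p>
print(clean_text)
```

The raw string-value keeps the newlines around the p element, which is why the normalized form is usually the one you want for the output shown in the question.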
Upvotes: 5
Reputation: 10666
Try wrapping the expression in string() with XPath:
response.xpath('string(//div)').extract_first()
Upvotes: 2
Reputation: 941
Check the following code for clarification; it selects every text node under the div, at any depth:
response.xpath('//div//text()')
Then try the following to get the required output:
" ".join(i.strip() for i in response.xpath('//div//text()').extract() if i.strip())
Upvotes: 0