Pramod

Reputation: 652

How to get the text of a child node that may itself contain other nodes, using XPath in Scrapy

I need to extract the text of a child node that may or may not be the parent of another node, using XPath in Scrapy. Consider this case:

<h1 class="main">
 <span class="child">data</span>
</h1>

or

<h1 class="main">
<span class="child">
 <span class="child2">data</span>
</span>
</h1>

My attempt was response.xpath(".//h1[@class='main']/span/text()").extract(), but it only returns "data" in the first case, not in the second.

Upvotes: 5

Views: 1796

Answers (2)

Anzel

Reputation: 20563

Use //text() instead of /text(). It returns a list of all text nodes inside your span, both its direct children and those of any nested elements:

response.xpath(".//h1[@class='main']/span//text()").extract()

Upvotes: 3

paul trmbrth

Reputation: 20748

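You can use XPath's string() function, which returns the text of the first matching node and all of its descendants, concatenated into a single string:

  • response.xpath("string(.//h1[@class='main']/span)").extract()
  • or even response.xpath("string(.//h1[@class='main'])").extract() if you're after the whole header text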

Upvotes: 1
