Reputation: 27
I keep getting HTML as well as the text I want from the XPath I am running, and I can't work out how to stop it. I just want the text.
The XPath
hxs.xpath('//h1[@class="body2"]').extract()
The HTML
<div class="product-title cf">
<h1 itemprop="name" class="body2">
Cornish Ale Dozen - Case of 12
</h1>
</div>
Any suggestions would be appreciated, thanks.
Upvotes: 1
Views: 42
Reputation: 89285
A pure XPath expression that selects the text nodes instead of the parent element would be as follows:
//h1[@class="body2"]/text()
In particular, the above XPath should work as you expect, assuming that the library being used to execute the XPath is Scrapy.
Upvotes: 1