Reputation: 13
Trying to pull the product name from a page:
https://www.v12outdoor.com/view-by-category/rock-climbing-gear/rock-climbing-shoes/mens.html
I can't find an XPath that returns a useful, specific result.
Apologies for my first post being such a beginner question :(
import scrapy

class V12Spider(scrapy.Spider):
    name = 'v12'
    start_urls = ['https://www.v12outdoor.com/view-by-category/rock-climbing-gear/rock-climbing-shoes/mens.html']

    def parse(self, response):
        yield {
            'price': response.xpath('//span[@id="product-price-26901"]/text()'),
            'name': response.xpath('//h3[@class="product-name"]/a/text()'),
        }
For name, I expected this to produce the names from the items in h3 tags with class product-name, but it generates multiple rows of data='\r\n
(Whilst we're at it: for price, is there any way to pull out only the numerical values?)
Upvotes: 1
Views: 51
Reputation: 843
You can solve this by calling the .get() method on the XPath result to extract the text as a string, and then calling .strip() on that string to remove the surrounding whitespace. I tried something like this:
name = response.xpath('//h3[@class="product-name"]/a/text()').get()
Gives
'\r\n RED CHILLI VOLTAGE '
Then using
name.strip()
gives
'RED CHILLI VOLTAGE'
So you can replace your name statement with
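name = response.xpath('//h3[@class="product-name"]/a/text()').get().strip()
The same approach works for price: just add .get().strip() to the end of your statement. Note that .get() returns None when the XPath matches nothing, in which case .strip() would raise an error.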
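To keep only the numerical part of the price, a regular expression over the extracted string is one option. This is a minimal sketch, assuming the price text looks something like '£84.99'; the helper name parse_price and the sample strings are made up for illustration and are not taken from the page, and a comma is assumed to be a thousands separator.

```python
import re
from typing import Optional

# Matches a run of digits, optionally followed by groups like ".99" or ",299".
_PRICE_RE = re.compile(r"\d+(?:[.,]\d+)*")


def parse_price(raw: Optional[str]) -> Optional[float]:
    """Return the first number found in raw as a float, or None (hypothetical helper)."""
    if raw is None:  # .get() returns None when the XPath matches nothing
        return None
    match = _PRICE_RE.search(raw)
    if match is None:
        return None
    # Assumption: commas are thousands separators, as in "1,299.00".
    return float(match.group().replace(",", ""))


# Sample values, invented for illustration.
print(parse_price("\r\n   £84.99  "))
print(parse_price("1,299.00"))
```

In the spider, this could be called as parse_price(response.xpath('//span[@id="product-price-26901"]/text()').get()). Scrapy selectors also offer .re_first(), which applies a regular expression directly to the selected text, so that may be worth a look as well.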
Hopefully this helps. Also read about the .getall() method at https://docs.scrapy.org/en/latest/topics/selectors.html
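Since .get() returns only the first match and the listing page has many products, .getall() is probably what you want for the names: it returns a list with the text of every match. A rough sketch of cleaning such a list, using a hypothetical helper clean_texts and sample strings shaped like the output above (the second shoe name is invented):

```python
from typing import Iterable, List


def clean_texts(raw_texts: Iterable[str]) -> List[str]:
    """Strip surrounding whitespace and drop entries that are empty afterwards."""
    return [text.strip() for text in raw_texts if text.strip()]


# In the spider, this list would come from:
#   response.xpath('//h3[@class="product-name"]/a/text()').getall()
# The values below are sample data for illustration only.
sample = ['\r\n    RED CHILLI VOLTAGE  ', '\r\n   ', '\r\n  EXAMPLE SHOE ']
names = clean_texts(sample)
print(names)
```

You could then yield one item per name from parse() instead of a single item for the whole page.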
Upvotes: 1