Reputation: 143
I need to extract data from HTML, but response.css, response.xpath, and combinations of the two are not working. Whenever I try to get the "regular-price" value, it always comes back as None.
I need to get the text value, which is $17.99.
Here's my code:
HTML
<div class="price parbase"><div class="primary-row product-item-price product-item-price-discount">
<span class="price-value">$12.99</span><small class="js-price-value-original price-value-original">$17.99</small>
</div>
</div>
Scrapy python
def parse_subpage(self, response):
    item = {
        'title': response.css('h1.primary.product-item-headline::text').extract_first(),
        'sale-price': response.xpath("normalize-space(.//span[@class='price-value']/text())").extract_first(),
        'regular-price': response.css('.js-price-value-original').xpath("@small").extract_first(),
        'photo-url': response.css('div.product-detail-main-image-container img::attr(src)').extract_first(),
        'description': response.css('p.pdp-description-text::text').extract_first()
    }
    yield item
The output should be regular-price: $17.99
Please help, thank you!
Upvotes: 0
Views: 438
Reputation: 1383
Thanks @vezunchik. If you want to use a CSS selector, you can use the code below:
response.css('script:contains("whitePrice")').re_first(r"'whitePrice'\s?:\s?'([^']+)'")
Upvotes: 0
Reputation: 3717
Your link gives me a 404, but judging by your HTML snippet you only need response.css('small.js-price-value-original::text').get(); small is the tag name, and there is no attribute named small.
UPD: Hm, it seems this data is rendered by JS. Check the HTML source of the page and you will see a huge JSON; search for the whitePrice keyword. You can retrieve such data, for example, with response.xpath('//script[contains(text(), "whitePrice")]/text()').re_first(r"'whitePrice'\s?:\s?'([^']+)'")
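As a rough illustration of what that regex does, the sketch below applies the same pattern with Python's re module. The script_text string is made up for the example (the real page's JSON is not shown in this thread), and extract_white_price is a hypothetical helper name.

```python
import re

# Hypothetical script content; only the 'whitePrice' key comes from the thread.
script_text = "var productData = {'name': 'Example', 'whitePrice': '$17.99'};"

# Raw string so that \s reaches the regex engine unchanged.
WHITE_PRICE_RE = re.compile(r"'whitePrice'\s?:\s?'([^']+)'")


def extract_white_price(text):
    """Return the value stored under 'whitePrice', or None if it is absent."""
    match = WHITE_PRICE_RE.search(text)
    return match.group(1) if match else None


print(extract_white_price(script_text))  # $17.99
```

re_first in Scrapy returns the first capture group in the same way, so the helper is only needed outside of a selector.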
Upvotes: 1
Reputation: 406
If this snippet is the only HTML you have, you can do:
def parse_subpage(self, response):
    item = {
        'title': response.css('h1.primary.product-item-headline::text').extract_first(),
        'sale-price': response.xpath("normalize-space(.//span[@class='price-value']/text())").extract_first(),
        'regular-price': response.xpath('//div/small[contains(@class, "js-price-value-original") and contains(@class, "price-value-original")]/text()').extract_first(),
        'photo-url': response.css('div.product-detail-main-image-container img::attr(src)').extract_first(),
        'description': response.css('p.pdp-description-text::text').extract_first()
    }
    yield item
Btw, the website you provided shows a "file not found" error.
Upvotes: 0