Reputation: 261
I'm scraping a website and want to get the price of every product on the first page. Below is the relevant HTML. I want to get 99.
<div class = 'item-bg'>
  <div class = 'product-box'>
    <div class = 'res-info'>
      <div class = 'price-box'>
        <span class = 'def-price selectorgadget_rejected'>
          <i>$</i>
          99
          <i>.99</i>
        </span>
      </div>
    </div>
  </div>
</div>
I don't think I can use the def-price class, because some products have 'selectorgadget_rejected' after it and others have 'selectorgadget_suggested'. My code right now is:
product_info = response.css('.item-bg')
for product in product_info:
    product_price_sn = product.css('.price-box').extract()
It's not getting 99 and I'm not sure how to fix it. Any ideas?
Upvotes: 2
Views: 850
Reputation: 2564
I always prefer to use XPath over CSS. In XPath you can use the contains() function to match on part of the class attribute, like:
response.xpath('//span[contains(@class, "def-price selectorgadget")]//text()').extract()
This selects the text of all <span> tags on the page whose class attribute contains the expression def-price selectorgadget, whether it is selectorgadget_rejected or selectorgadget_suggested. Or, using the pre-selected product_info:
product_info = response.css('.item-bg')
for product in product_info:
    product_price_sn = product.xpath('div/div/div/span[contains(@class, "def-price selectorgadget")]//text()').extract()
I used the full relative path here because only a snippet of the HTML was posted.
If you want only the 99 outside the <i> tags, use /text() instead of //text().
Now, in case you want to stick with the CSS selectors, this might work:
product.css('.price-box span::text').extract()
Upvotes: 1