Scraping data between two spans scrapy

Question

I'm scraping a website and want to get the price of every product on the first page. Below is the HTML of the page. I want to get 99.

I don't think I can use the def-price class because some products have 'selectorgadget_rejected' and some products have 'selectorgadget_suggested' after it. My code right now is

product_info = response.css('.item-bg')
for product in product_info:
    product_price_sn = product.css('.price-box').extract()

It's not getting 99 and I'm not sure how to fix it. Any ideas?

Here is the screenshot of the full HTML info:

renatodvc · Accepted Answer

I always prefer to use XPath over CSS. In XPath you could use the contains function to specify which classes you want to select, like:

response.xpath('//span[contains(@class, "def-price selectorgadget")]//text()').extract()

This would extract the text from ALL the span tags on the page whose class attribute contains the expression def-price selectorgadget, whether it is followed by selectorgadget_rejected or selectorgadget_suggested.

Or using the pre-selected product_info:

product_info = response.css('.item-bg')
for product in product_info:
    product_price_sn = product.xpath('div/div/div/span[contains(@class, "def-price selectorgadget")]//text()').extract()

I used the full relative path (div/div/div/span) because only a snippet of the HTML was posted.

If you want only the 99 that sits directly inside the span, outside its child tags, use /text() instead of //text().

CSS Selector

Now, in case you want to stick with the CSS selectors, this might work:

product.css('.price-box span::text').extract()
