Remove whitespace with the strip method in Python in a Scrapy script: ways to avoid None from extract

Question

Calling strip() fails when extract_first() returns None because the selector matched nothing. What is a better way to handle this?

import scrapy

class GamesSpider(scrapy.Spider):
    name = "games"
    start_urls = [
        'myurl',
    ]

    def parse(self, response):
        for game in response.css('ol#products-list li.item'):
            yield {
                'name': game.css('h2.product-name a::text').extract_first().strip(),
                'age': game.css('.list-price ul li:nth-child(1)::text').extract_first().strip(),
                'players': game.css('.list-price ul li:nth-child(2)::text').extract_first().strip(),
                'duration': game.css('.list-price ul li:nth-child(3)::text').extract_first().strip(),
                'dimensions': game.css('.list-price ul li:nth-child(4)::text').extract_first().strip()
            }

Rom · Accepted Answer

The Scrapy documentation (https://doc.scrapy.org/en/latest/intro/tutorial.html) says:

using .extract_first() avoids an IndexError and returns None when it doesn’t find any element matching the selection.

So some of your selectors return None rather than a string, and calling strip() on None raises AttributeError: 'NoneType' object has no attribute 'strip'. You need to handle the case where None is returned.
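
One way to handle it, as a minimal sketch: wrap the strip in a small helper that falls back to a default when the value is None. The helper name clean and its default parameter are made up for illustration and are not part of Scrapy.

```python
from typing import Optional


def clean(value: Optional[str], default: str = "") -> str:
    """Strip surrounding whitespace; return `default` if value is None.

    `clean` is a hypothetical helper, not a Scrapy function.
    """
    if value is None:
        return default
    return value.strip()


# Assumed usage inside parse(), reusing the selectors from the question:
#     'name': clean(game.css('h2.product-name a::text').extract_first()),
#     'age': clean(game.css('.list-price ul li:nth-child(1)::text').extract_first()),
#
# extract_first() also accepts a default, so the result is always a string:
#     'name': game.css('h2.product-name a::text').extract_first(default='').strip(),

if __name__ == "__main__":
    print(repr(clean("  Catan \n")))  # a normal string is stripped
    print(repr(clean(None)))          # None falls back to the default
```

The helper keeps each field on one line and lets you pick a different fallback per field, for example clean(value, default='n/a'). Passing default='' to extract_first() is shorter if an empty string is acceptable for missing fields.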
