oyerohabib

Reputation: 395

How can I set default values for scraped results when they return null/None?

I have scraped some information from a website where some values are not present, so the scraper returns null. Is there a way to output a default value in such cases, for different fields? The sample script is below.

script.py

import scrapy

class UfcscraperSpider(scrapy.Spider):
    name = 'ufcscraper'

    start_urls = ['http://ufcstats.com/statistics/fighters?char=a']

    def parse(self, response):
        for user_info in response.css(".b-statistics__table-row")[2::]:
            result = {
                "fname": user_info.css("td:nth-child(1) a::text").get(),
                "lname": user_info.css("td:nth-child(2) a::text").get(),
                "nname": user_info.css("td:nth-child(3) a::text").get(),
                "height": user_info.css("td:nth-child(4)::text").get().strip(),
                "weight": user_info.css("td:nth-child(5)::text").get().strip(),
                "reach": user_info.css("td:nth-child(6)::text").get().strip(),
                "stance": user_info.css("td:nth-child(7)::text").get().strip(),
                "win": user_info.css("td:nth-child(8)::text").get().strip(),
                "lose": user_info.css("td:nth-child(9)::text").get().strip(),
                "draw": user_info.css("td:nth-child(10)::text").get().strip()
            }

            yield result  # yield inside the loop so every row is emitted

For instance, the nname field in the first row has a value of null, while stance has a value of "" (an empty string). How can I set a default value for such occurrences?

sample result

[
{"fname": "Tom", "lname": "Aaron", "nname": null, "height": "--", "weight": "155 lbs.", "reach": "--", "stance": "", "win": "5", "lose": "3", "draw": "0"},
{"fname": "Danny", "lname": "Abbadi", "nname": "The Assassin", "height": "5' 11\"", "weight": "155 lbs.", "reach": "--", "stance": "Orthodox", "win": "4", "lose": "6", "draw": "0"},
]

Upvotes: 0

Views: 138

Answers (1)

chitown88

Reputation: 28620

You could either put the logic to replace any "" in your function, or you could loop through the results and, whenever you come across "", replace it with whatever you'd like as the default.

data = [
{"fname": "Tom", "lname": "Aaron", "nname": "", "height": "--", "weight": "155 lbs.", "reach": "--", "stance": "", "win": "5", "lose": "3", "draw": "0"},
{"fname": "Danny", "lname": "Abbadi", "nname": "The Assassin", "height": "5' 11\"", "weight": "155 lbs.", "reach": "--", "stance": "Orthodox", "win": "4", "lose": "6", "draw": "0"},
]


for idx, each in enumerate(data):
    for k, v in each.items():
        if v in ('', None):  # treat both empty strings and None (JSON null) as missing
            data[idx][k] = 'DEFAULT'

Output:

print(data)
[
{'fname': 'Tom', 'lname': 'Aaron', 'nname': 'DEFAULT', 'height': '--', 'weight': '155 lbs.', 'reach': '--', 'stance': 'DEFAULT', 'win': '5', 'lose': '3', 'draw': '0'}, 
{'fname': 'Danny', 'lname': 'Abbadi', 'nname': 'The Assassin', 'height': '5\' 11"', 'weight': '155 lbs.', 'reach': '--', 'stance': 'Orthodox', 'win': '4', 'lose': '6', 'draw': '0'}
]
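
The other option mentioned above, putting the logic in the function itself, could look roughly like the sketch below. It treats both None and blank strings as missing and allows a different default per field. The names with_default, apply_defaults and FIELD_DEFAULTS, as well as the default values "N/A" and "Unknown", are made up for illustration and are not part of Scrapy or the original script.

```python
# Hypothetical per-field defaults; the field names come from the question,
# the default values are placeholders chosen for illustration.
FIELD_DEFAULTS = {"nname": "N/A", "stance": "Unknown"}


def with_default(value, default="DEFAULT"):
    """Return the stripped string, or `default` if value is None or blank."""
    if value is None:
        return default
    value = value.strip()
    return value if value else default


def apply_defaults(row, field_defaults=FIELD_DEFAULTS, fallback="DEFAULT"):
    """Return a new dict with None/blank values replaced per field."""
    return {
        key: with_default(value, field_defaults.get(key, fallback))
        for key, value in row.items()
    }


# First row from the question's sample result (null becomes None in Python).
row = {"fname": "Tom", "lname": "Aaron", "nname": None, "height": "--",
       "weight": "155 lbs.", "reach": "--", "stance": "", "win": "5",
       "lose": "3", "draw": "0"}
cleaned = apply_defaults(row)
print(cleaned["nname"], cleaned["stance"])  # N/A Unknown
```

Inside the spider, each field could be wrapped the same way, for example with_default(user_info.css("td:nth-child(7)::text").get(), "Unknown"), which also avoids calling .strip() on None when a cell has no text.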

Upvotes: 1
