Kwinten

Reputation: 11

Scrapy not parsing data

I'm new to Scrapy, and I'm trying to save my favourite team's score to a JSON file. However, the JSON file stays empty.

Here's my code:

import scrapy
from scrapy.crawler import CrawlerProcess


class SoccerwaySpider(scrapy.Spider):
    name="Soccerway"
    start_urls = ['https://fr.soccerway.com/teams/france/olympique-de-marseille/890/']

    def start_requests(self):
        headers= {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:48.0) Gecko/20100101 Firefox/48.0'}
        for url in self.start_urls:
            yield scrapy.Request(url, headers=headers, callback=self.parse)

    def parse(self,response):
        yield
        {
        'score':str.strip(response.css("table.matches").css('td.score-time.score').css('a::text').get()),
        }

process = CrawlerProcess(settings={
    "FEEDS": {
        "Soccerway.json": {"format": "json"},
    },
})
process.crawl(SoccerwaySpider)
process.start()

Thank you in advance!

Upvotes: 0

Views: 129

Answers (2)

SuperUser

Reputation: 4822

import scrapy
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings


class SoccerwaySpider(scrapy.Spider):
    name = "Soccerway"
    start_urls = ['https://fr.soccerway.com/teams/france/olympique-de-marseille/890/']
    custom_settings={"FEEDS": {"Soccerway.json": {"format": "json"}}}

    def start_requests(self):
        headers = {
            'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:48.0) Gecko/20100101 Firefox/48.0'
        }
        for url in self.start_urls:
            yield scrapy.Request(url, headers=headers, callback=self.parse)

    def parse(self, response):
        # The opening brace is on the same line as `yield`, so the dict is the yielded item.
        yield {
            'score': str.strip(response.css("table.matches").css('td.score-time.score').css('a::text').get()),
        }


if __name__ == "__main__":
    process = CrawlerProcess(get_project_settings())
    # Pass the spider class itself so the script also works outside a Scrapy project.
    process.crawl(SoccerwaySpider)
    process.start()

Soccerway.json:

[
{"score": "2 - 2"}
]
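
One thing to watch in the parse method above: str.strip() raises a TypeError when it is given None, and .get() returns None when the selector matches nothing. A minimal sketch of a guard, assuming you would rather get an empty result than a crash (clean_score is a hypothetical helper name, not part of Scrapy, and the raw argument stands in for whatever .get() returned):

```python
def clean_score(raw):
    """Return the stripped score text, or None when the selector matched nothing."""
    # `raw` stands in for the value of response.css(...).get(),
    # which is None when no element matches.
    if raw is None:
        return None
    return raw.strip()


print(clean_score("\n  2 - 2  \n"))  # 2 - 2
print(clean_score(None))             # None
```

In the spider this would replace the str.strip(...) call, so a page without a score yields 'score': None instead of raising.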

Upvotes: 0

furas

Reputation: 142631

The problem is that you put the { in the wrong place. It has to be on the same line as yield:

yield {
    'score': ...,
}

If you put it on the next line, Python treats it as two separate statements:

# statement 1 - bare yield, which yields None
yield

# statement 2 - builds a dictionary that is never assigned or yielded, so it is discarded
{
    'score': ...,
}
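
You can check this without Scrapy. The sketch below uses two plain generators (broken and fixed are names made up for this example, and the score value is hard-coded) and shows what each one actually produces:

```python
def broken():
    # The statement ends after the bare `yield`, so the generator produces None.
    yield
    # This dict literal is a separate expression statement: built, then discarded.
    {
        'score': '2 - 2',
    }


def fixed():
    # The opening brace is on the same line as `yield`, so the dict is the yielded value.
    yield {
        'score': '2 - 2',
    }


broken_items = list(broken())
fixed_items = list(fixed())
print(broken_items)  # [None]
print(fixed_items)   # [{'score': '2 - 2'}]
```

The first version never yields an item, only None, which is why nothing ends up in the JSON file.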

Upvotes: 1
