Python Scrapy Returning 200 But Closes Spider With Nothing

Question

I'm new to Scrapy and trying to scrape some simple HTML tables. I've found a page that contains two tables with the same schema, but the scrape works for one of them and not for the other. Here's the link: https://fbref.com/en/comps/12/stats/La-Liga-Stats

My code that works (the first table, the one at the top):

import scrapy


class PostSpider(scrapy.Spider):

    name = 'stats'

    start_urls = [
        'https://fbref.com/en/comps/12/stats/La-Liga-Stats',
    ]

    def parse(self, response):
        for row in response.xpath('//*[@id="stats_standard_squads"]//tbody/tr'):
            yield {
                'players': row.xpath('td[2]//text()').extract_first(),
                'possession': row.xpath('td[3]//text()').extract_first(),
                'played': row.xpath('td[4]//text()').extract_first(),
                'starts': row.xpath('td[5]//text()').extract_first(),
                'minutes': row.xpath('td[6]//text()').extract_first(),
                'goals': row.xpath('td[7]//text()').extract_first(),
                'assists': row.xpath('td[8]//text()').extract_first(),
                'penalties': row.xpath('td[9]//text()').extract_first(),
            }

For some reason, when I try to scrape the second table further down the page (using the corresponding XPath selector), it returns nothing:

import scrapy


class PostSpider(scrapy.Spider):

    name = 'stats'

    start_urls = [
        'https://fbref.com/en/comps/12/stats/La-Liga-Stats',
    ]

    def parse(self, response):
        for row in response.xpath('//*[@id="stats_standard"]//tbody/tr'):
            yield {
                'player': row.xpath('td[2]//text()').extract_first(),
                'nation': row.xpath('td[3]//text()').extract_first(),
                'pos': row.xpath('td[4]//text()').extract_first(),
                'squad': row.xpath('td[5]//text()').extract_first(),
                'age': row.xpath('td[6]//text()').extract_first(),
                'born': row.xpath('td[7]//text()').extract_first(),
                '90s': row.xpath('td[8]//text()').extract_first(),
                'att': row.xpath('td[9]//text()').extract_first(),
            }

Here are the logs from the terminal when I execute scrapy crawl stats:

2020-07-23 17:35:33 [scrapy.core.engine] DEBUG: Crawled (200)  (referer: None)
2020-07-23 17:35:33 [scrapy.core.engine] DEBUG: Crawled (200)  (referer: None)
2020-07-23 17:35:34 [scrapy.core.engine] INFO: Closing spider (finished)

What's the reason this is happening? The tables have an identical structure as far as I can see.
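
One thing that might be worth checking, as a guess rather than a confirmed diagnosis: some sites ship secondary tables inside an HTML comment and only un-comment them with JavaScript in the browser. Scrapy does not run JavaScript, so an XPath aimed at such a table would match nothing even though the response is a 200. The sketch below uses only the standard library to test whether a given id appears inside a comment. The helper name find_commented_markup and the sample HTML are hypothetical, made up for illustration, and the sample is not taken from the real page.

```python
import re

# Matches the body of every HTML comment, including multi-line ones.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def find_commented_markup(html, element_id):
    """Return the body of the first HTML comment that mentions
    id="<element_id>", or None if no comment contains it.
    (Hypothetical helper, not part of Scrapy.)"""
    needle = f'id="{element_id}"'
    for body in COMMENT_RE.findall(html):
        if needle in body:
            return body
    return None


# Made-up miniature page: one live table, one table hidden in a comment.
SAMPLE_HTML = """
<div><table id="stats_standard_squads"><tbody><tr><td>a</td></tr></tbody></table></div>
<div><!--
<table id="stats_standard"><tbody><tr><td>b</td></tr></tbody></table>
--></div>
"""

# The commented-out table is found; the live table is not inside any comment.
hidden = find_commented_markup(SAMPLE_HTML, "stats_standard")
visible = find_commented_markup(SAMPLE_HTML, "stats_standard_squads")
```

Inside parse, the same check could be run as find_commented_markup(response.text, "stats_standard"). If it returns markup, wrapping that string in scrapy.Selector(text=...) should allow the existing //tbody/tr XPath to be applied to it. If it returns None, the cause is something else and this guess can be ruled out.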
