Scrapy scraper not scraping past 1st page

Question

I am following a Scrapy tutorial. I believe my code matches the tutorial's, yet my scraper only scrapes the first page, logs the following message about my first Request to another page, and then finishes. Have I perhaps put my second yield statement in the wrong place?

DEBUG: Filtered offsite request to 'newyork.craigslist.org': <GET https://newyork.craigslist.org/search/egr?s=120>

2017-05-20 18:21:31 [scrapy.core.engine] INFO: Closing spider (finished)

Here is my code:

import scrapy
from scrapy import Request


class JobsSpider(scrapy.Spider):
    name = "jobs"
    allowed_domains = ["https://newyork.craigslist.org/search/egr"]
    start_urls = ['https://newyork.craigslist.org/search/egr/']

    def parse(self, response):
        jobs = response.xpath('//p[@class="result-info"]')

        for job in jobs:
            title = job.xpath('a/text()').extract_first()
            address = job.xpath('span[@class="result-meta"]/span[@class="result-hood"]/text()').extract_first("")[2:-1]
            relative_url = job.xpath('a/@href').extract_first("")
            absolute_url = response.urljoin(relative_url)

            yield {'URL': absolute_url, 'Title': title, 'Address': address}

        # scrape all pages
        next_page_relative_url = response.xpath('//a[@class="button next"]/@href').extract_first()
        next_page_absolute_url = response.urljoin(next_page_relative_url)

        yield Request(next_page_absolute_url, callback=self.parse)


Answers (1)
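
The yield placement looks fine. Judging from the "Filtered offsite request" line, the likely culprit is allowed_domains: it should hold bare domain names, not full URLs. With "https://newyork.craigslist.org/search/egr" in the list, the hostname of the next-page request never matches, so the request is dropped. The first page still loads because, as far as I know, requests built from start_urls bypass that filter. Below is a minimal sketch of the comparison using only the standard library; is_allowed is a hypothetical helper that approximates the offsite check, not Scrapy's actual code.

```python
from urllib.parse import urlparse


def is_allowed(url, allowed_domains):
    """Simplified stand-in for the offsite check (hypothetical helper).

    A URL passes only if its hostname equals an allowed domain
    or is a subdomain of one.
    """
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


next_page = "https://newyork.craigslist.org/search/egr?s=120"

# Value from the question: a full URL, which can never equal a hostname.
broken = ["https://newyork.craigslist.org/search/egr"]

# Corrected value: the bare domain, with no scheme and no path.
fixed = ["newyork.craigslist.org"]

print(is_allowed(next_page, broken))  # False
print(is_allowed(next_page, fixed))   # True
```

In the spider itself, that means changing the one line to allowed_domains = ["newyork.craigslist.org"] and leaving the rest as it is.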
