Why does my Scrapy script scrape only the first page and not the others?

Question

I'm trying to scrape some information from the website http://quotes.toscrape.com/

But I can't find a way to scrape all the pages: the script only scrapes the first page, and I don't understand what I'm doing wrong.

Here's my script so far:

import scrapy

from ..items import QuotetutorialItem

class QuoteSpider(scrapy.Spider):
    name = 'quotes'
    page_number = 2
    start_urls = ['http://quotes.toscrape.com/page/1/']

    def parse(self, response):

        items = QuotetutorialItem()

        all_div_quotes = response.css('div.quote')

        for quotes in all_div_quotes:   

            title = quotes.css('span.text::text').extract()
            author = quotes.css('.author::text').extract()
            tags = quotes.css('.tag::text').extract()

            items['title'] = title
            items['author'] = author
            items['tags'] = tags

            yield items

        next_page = 'http://quotes.toscrape.com/page/' + str(QuoteSpider.page_number) + '/'

        if QuoteSpider.page_number < 11:
            QuoteSpider.page_number += 1
            yield response.follow(next_page, callback=self.parse)

I run scrapy crawl quote in the terminal, but it only gives me the information from the first page.

Any ideas?

Thank you!

Samsul Islam · Accepted Answer

I think your code is OK: it extracts the information from all 10 pages. Please add

items['url'] = response.url

in your parse function, then check again whether it extracts the information from all 10 pages.
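
One way to sanity-check the claim that the counter logic reaches all 10 pages, without running Scrapy or touching the network, is to replay the page_number bookkeeping from the question in plain Python. This is only a sketch: simulate_crawl is a hypothetical helper, not part of Scrapy, and it assumes the requests are handled one at a time, in order. The URLs it records are the ones that the url field would show on the scraped items.

```python
BASE = "http://quotes.toscrape.com/page/"


def simulate_crawl(first_next=2, limit=11):
    """Replay the page_number logic of the question's parse() method.

    Hypothetical helper for illustration only; no Scrapy, no network.
    """
    page_number = first_next       # stands in for QuoteSpider.page_number
    pending = [f"{BASE}1/"]        # stands in for start_urls
    visited = []                   # what items['url'] would record

    while pending:
        url = pending.pop(0)
        visited.append(url)        # parse() is called for this response

        # Same steps as the end of parse() in the question.
        next_page = f"{BASE}{page_number}/"
        if page_number < limit:
            page_number += 1
            pending.append(next_page)  # response.follow(next_page, ...)

    return visited


pages = simulate_crawl()
for page in pages:
    print(page)
```

If the url values from the real crawl stop at page 1 while this replay lists more pages, the pagination logic is probably not at fault, and the cause may lie elsewhere, for example in how the spider is run or how the output is inspected.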
