LaurieFalcon

Reputation: 103

Why does my Scrapy script scrape only the first page and not the others?

I'm trying to scrape some information from the website http://quotes.toscrape.com/

But I cannot find a way to scrape all the pages: the script scrapes only the first page, and I don't understand what I'm doing wrong.

Here's my script so far:

import scrapy

from ..items import QuotetutorialItem

class QuoteSpider(scrapy.Spider):
    name = 'quotes'
    page_number = 2
    start_urls = ['http://quotes.toscrape.com/page/1/']

    def parse(self, response):

        items = QuotetutorialItem()

        all_div_quotes = response.css('div.quote')

        for quotes in all_div_quotes:   

            title = quotes.css('span.text::text').extract()
            author = quotes.css('.author::text').extract()
            tags = quotes.css('.tag::text').extract()

            items['title'] = title
            items['author'] = author
            items['tags'] = tags

            yield items

        next_page = 'http://quotes.toscrape.com/page/' + str(QuoteSpider.page_number) + '/'

        if QuoteSpider.page_number < 11:
            QuoteSpider.page_number += 1
            yield response.follow(next_page, callback=self.parse)

I run scrapy crawl quote in the terminal, and it gives me only the information from the first page.

Any ideas?

Thank you.

Upvotes: 0

Views: 213

Answers (1)

Samsul Islam

Reputation: 2619

I think your code is OK. It extracts the information from all 10 pages. Please add

items['url'] = response.url

to your parse function. Then check again whether each item comes from a different page, which will show whether all 10 pages are being scraped.
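To see why the counter logic should reach every page, here is a minimal sketch that replays it in plain Python, without Scrapy or any network access. The function name simulate_crawl and its parameters are hypothetical and exist only for this illustration. It assumes each parse call runs once per followed page, in order, as in the question's spider.

```python
BASE_URL = 'http://quotes.toscrape.com/page/'


def simulate_crawl(start_page=1, first_next=2, limit=11):
    """Return, in order, the URLs the question's counter logic would visit.

    Hypothetical helper: it only mirrors the page_number bookkeeping
    from parse() and performs no HTTP requests.
    """
    page_number = first_next                  # mirrors page_number = 2
    visited = [f'{BASE_URL}{start_page}/']    # mirrors start_urls
    pending = list(visited)

    while pending:
        pending.pop(0)                        # "parse" the next page
        next_page = f'{BASE_URL}{page_number}/'
        if page_number < limit:               # mirrors: if page_number < 11
            page_number += 1
            visited.append(next_page)         # mirrors response.follow(...)
            pending.append(next_page)

    return visited


if __name__ == '__main__':
    for url in simulate_crawl():
        print(url)
```

The replay visits pages 1 through 10 and stops, which matches the 10 pages mentioned above. One thing that may be worth checking: the spider declares name = 'quotes', while the command in the question is scrapy crawl quote, and Scrapy selects the spider to run by its name attribute.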

Upvotes: 1
