Reputation: 103
I'm trying to scrape some information from the website http://quotes.toscrape.com/
But I cannot find a way to scrape all the pages. The script only scrapes the first page, and I don't understand what I'm doing wrong.
Here's my script so far:
    import scrapy
    from ..items import QuotetutorialItem


    class QuoteSpider(scrapy.Spider):
        name = 'quotes'
        page_number = 2
        start_urls = ['http://quotes.toscrape.com/page/1/']

        def parse(self, response):
            items = QuotetutorialItem()
            all_div_quotes = response.css('div.quote')
            for quotes in all_div_quotes:
                title = quotes.css('span.text::text').extract()
                author = quotes.css('.author::text').extract()
                tags = quotes.css('.tag::text').extract()
                items['title'] = title
                items['author'] = author
                items['tags'] = tags
                yield items
            next_page = 'http://quotes.toscrape.com/page/' + str(QuoteSpider.page_number) + '/'
            if QuoteSpider.page_number < 11:
                QuoteSpider.page_number += 1
                yield response.follow(next_page, callback=self.parse)
I then run
    scrapy crawl quote
in the terminal, and it only gives me the information from the first page.
Any ideas?
Thank you!
Upvotes: 0
Views: 213
Reputation: 2619
I think your code is OK. It should extract the information from all 10 pages. Please add
    items['url'] = response.url
in your parse function, inside the loop before yield items. Then check whether the output contains items from all 10 pages or not.
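To see why the counter logic should reach all 10 pages, here is a minimal sketch that simulates only the page_number bookkeeping from the question's parse method, without Scrapy or any network access. The function name pages_requested and its parameters are hypothetical, made up for this illustration, and it assumes each response is parsed exactly once, one after another.

```python
def pages_requested(start_page=1, first_next=2, limit=11):
    """Simulate the page counter from the question's parse() method.

    Hypothetical helper for illustration only: it makes no HTTP requests
    and just returns the page numbers that would be requested, in order.
    """
    visited = [start_page]    # start_urls contains page 1
    page_number = first_next  # mirrors the class attribute page_number = 2
    pending = [start_page]    # responses still waiting to be parsed
    while pending:
        pending.pop(0)
        next_page = page_number  # the URL is built before the increment
        if page_number < limit:
            page_number += 1
            visited.append(next_page)
            pending.append(next_page)
    return visited


print(pages_requested())
```

Under those assumptions the simulation lists pages 1 through 10, which is what the url field in the scraped items should also show.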
Upvotes: 1