Reputation: 11
I'm trying to use Scrapy to crawl a page with a lot of links inside, but my existing code only shows the contents of the first link.
What mistake have I made?
from scrapy.spiders import BaseSpider
from scrapy.spiders import Spider
from scrapy.http.request import Request
from scrapy.selector import Selector
from Proje.items import ProjeItem

class ProjeSpider(BaseSpider):
    name = "someweb"
    allowed_domains = ["someweb.com"]
    start_urls = [
        "http://someweb.com/indeks/"
    ]

    def parse(self, response):
        for sel in response.xpath('//ul[@id="indeks-container"]'):
            for tete in sel.xpath('//linkkk').re('//linkkk.*?(?=")'):
                links = 'http:' + str(tete)
                req = Request(links, callback=self.kontene)
                return req

    def kontene(self, response):
        for mbuh in response.xpath('//head'):
            Item = ProjeItem()
            Item['title'] = mbuh.xpath('//title/text()').extract()
            yield Item
Upvotes: 0
Views: 431
Reputation: 10349
According to the Scrapy docs, parse needs to return an iterable of Request objects, i.e. a list or a generator. Just change return to yield and it should work as expected:
def parse(self, response):
    for sel in response.xpath('//ul[@id="indeks-container"]'):
        for tete in sel.xpath('//linkkk').re('//linkkk.*?(?=")'):
            links = 'http:' + str(tete)
            req = Request(links, callback=self.kontene)
            yield req
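To see why that one-word change matters, here is a minimal sketch in plain Python (no Scrapy involved): a return inside a loop exits the whole function on the first pass, while yield turns the function into a generator that produces every item.

def first_only(items):
    for item in items:
        return item  # exits the function on the first iteration

def all_of_them(items):
    for item in items:
        yield item  # produces one item, then resumes the loop

print(first_only([1, 2, 3]))         # 1
print(list(all_of_them([1, 2, 3])))  # [1, 2, 3]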
Upvotes: 1
Reputation: 807
The issue is that you have a return statement within your for loop. In Python, a return exits the function immediately, giving you only the first link's worth of content. Instead, consider appending each req to a list and returning that list.
def parse(self, response):
    req_list = []
    for sel in response.xpath('//ul[@id="indeks-container"]'):
        for tete in sel.xpath('//linkkk').re('//linkkk.*?(?=")'):
            links = 'http:' + str(tete)
            req = Request(links, callback=self.kontene)
            req_list.append(req)  # append the single Request; += expects an iterable
    return req_list
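Note the append rather than +=: a quick sketch of the difference, since list += expects an iterable on the right-hand side and a single Request object is not one.

reqs = []
reqs += ["a", "b"]   # fine: the right-hand side is an iterable
reqs.append("c")     # appends one object as-is
# reqs += object()   # would raise TypeError: the operand is not iterable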
Upvotes: 1