user3839860

Reputation: 21

How to use Scrapy to crawl multi-level pages (two levels)?

On the first page my spider correctly scrapes the text "test1" from the title tag, but it gets nothing from the second page, "test2.html". My script:

from scrapy.spider import Spider
from scrapy.selector import Selector
from testscrapy1.items import Website


class DmozSpider(Spider):
    name = "bill"
    allowed_domains = ["http://www.mywebsite.com"]
    start_urls = [
        "http://www.mywebsite.com/test.html"]

    def parse(self, response):
        for site in response.xpath('//head'):
            item = Website()
            item['title'] = site.xpath('//title/text()').extract()
            yield item

        yield scrapy.Request(url="www.mywebsite.com/test1.html", callback=self.other_function)

    def other_function(self, response):
        for other_thing in response.xpath('//head'):
            item = Website()
            item['title'] = other_thing.xpath('//title/text()').extract()
            yield item

Thank you in advance, STEF

Upvotes: 2

Views: 1293

Answers (1)

sunny

Reputation: 748

Try

yield scrapy.Request(url="www.mywebsite.com", callback=self.other_function)

instead of

yield scrapy.Request(url="www.mywebsite.com/test1.html", callback=self.other_function)
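Whichever path is requested, it may also be worth checking the form of the URLs themselves. Scrapy's Request raises a ValueError for a URL without a scheme, and allowed_domains expects bare domain names rather than full URLs. The sketch below uses only the standard library to show the difference; the helper names absolute_url and bare_domain are hypothetical, made up for illustration, and the URLs are the ones from the question.

```python
from urllib.parse import urljoin, urlparse

# Page URL taken from the question's start_urls.
START_URL = "http://www.mywebsite.com/test.html"


def absolute_url(base_url, link):
    """Resolve a link found on a page against that page's URL (hypothetical helper)."""
    return urljoin(base_url, link)


def bare_domain(url):
    """Return only the host name, without scheme or path (hypothetical helper)."""
    return urlparse(url).netloc


# The URL passed to Request in the question has no scheme at all.
print(repr(urlparse("www.mywebsite.com/test1.html").scheme))

# Resolving the second page against the first page yields a full URL.
second_page = absolute_url(START_URL, "test1.html")
print(second_page)

# The host name alone is the form a domain filter can match against.
print(bare_domain(START_URL))
```

Under these assumptions, the value of second_page is what would go into the url argument of scrapy.Request, and the value of bare_domain(START_URL) is what would go into allowed_domains. The spider also calls scrapy.Request without an "import scrapy" line, which would need to be added.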

Upvotes: 2
