Reputation: 1217
This is how my spider is set up:
from scrapy.http import Request
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor


class CustomSpider(CrawlSpider):
    name = 'custombot'
    allowed_domains = ['www.domain.com']
    start_urls = ['http://www.domain.com/some-url']

    rules = (
        Rule(SgmlLinkExtractor(allow=r'.*?something/'), callback='do_stuff', follow=True),
    )

    def start_requests(self):
        return [Request('http://www.domain.com/some-other-url', callback=self.do_something_else)]
It goes to /some-other-url but not to /some-url. What is wrong here? The URLs specified in start_urls are the ones that need links extracted and sent through the rules filter, whereas the one in start_requests is sent directly to the item parser, so it doesn't need to pass through the rules filter.
Upvotes: 12
Views: 22114
Reputation: 7889
From the documentation for start_requests, overriding start_requests means that the URLs defined in start_urls are ignored:
This is the method called by Scrapy when the spider is opened for scraping when no particular URLs are specified. If particular URLs are specified, the make_requests_from_url() is used instead to create the Requests.
[...]
If you want to change the Requests used to start scraping a domain, this is the method to override.
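To see why, it helps to look at what the default implementation does. In Scrapy versions of this era it is roughly equivalent to the sketch below, so overriding it in your spider removes the only code path that ever reads start_urls:

def start_requests(self):
    # Rough sketch of Scrapy's default start_requests (old-style API).
    # Your override replaces this loop, so start_urls is never consulted.
    for url in self.start_urls:
        yield self.make_requests_from_url(url)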
If you want to just scrape from /some-url, then remove start_requests. If you want to scrape from both, then keep start_requests but also return a Request for /some-url from it; a Request with no callback is handled by CrawlSpider's default parse(), which runs the response through the rules.
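Put together, a minimal sketch of that second option (using the same hypothetical URLs and callbacks as the question) could look like this:

from scrapy.http import Request
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor


class CustomSpider(CrawlSpider):
    name = 'custombot'
    allowed_domains = ['www.domain.com']

    rules = (
        Rule(SgmlLinkExtractor(allow=r'.*?something/'), callback='do_stuff', follow=True),
    )

    def start_requests(self):
        # No callback: the response is handled by CrawlSpider's default
        # parse(), which applies the rules above.
        yield Request('http://www.domain.com/some-url')
        # Explicit callback: this response bypasses the rules entirely.
        yield Request('http://www.domain.com/some-other-url', callback=self.do_something_else)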
Upvotes: 15