Reputation: 865
I'm using SgmlLinkExtractor in Scrapy to parse specific URLs.
I override the start_requests function to crawl dynamic URLs.
start_requests(self): ..... yield Requests(url.strip(), callbackA)
callbackA does nothing right now.
I also implemented process_value for the SgmlLinkExtractor, but it is never called.
rules = [Rule(SgmlLinkExtractor(allow=()), callback=callbackB, follow=True),]
Again, callbackB is never called.
Upvotes: 0
Views: 415
Reputation: 8202
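If your callbacks are declared in your spider, they are methods of the class rather than names with global scope, so a bare callbackB will not resolve. Note that self does not exist in the class body where rules is declared, so self.callbackB raises a NameError there. Pass the method name as a string instead, and Scrapy looks it up on the spider instance:
rules = [
    Rule(SgmlLinkExtractor(), callback='callbackB', follow=True),
]
Also check that the responses from start_requests reach CrawlSpider's own parse method. As far as I know, rules are not applied to responses handled by a custom callback such as callbackA.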
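To illustrate why a string name works where self does not, here is a minimal plain-Python sketch. It does not use Scrapy: Rule, Spider, and resolve below are hypothetical stand-ins, and the lookup is only an assumption about the general mechanism, not Scrapy's actual implementation.

```python
class Rule:
    """Hypothetical stand-in for a crawl rule; it just stores the callback."""

    def __init__(self, callback):
        self.callback = callback


# Referencing self in a class body fails, because no instance exists yet.
try:
    class Broken:
        rules = [Rule(callback=self.callbackB)]
except NameError as exc:
    error = type(exc).__name__


class Spider:
    # Name the method by string; it is resolved later on the instance.
    rules = [Rule(callback="callbackB")]

    def callbackB(self, response):
        return "callbackB got " + response

    def resolve(self, rule):
        # Turn a string name into a bound method; pass callables through.
        cb = rule.callback
        return getattr(self, cb) if isinstance(cb, str) else cb


spider = Spider()
handler = spider.resolve(Spider.rules[0])
result = handler("page")
```

The string is only turned into a bound method once an instance exists, which is why it can appear safely in a class-level rules list.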
Upvotes: 0