Reputation: 10570
I want to extract all the data from a website.
I am using Scrapy 0.20.2.
My code is:
    class MySpider(CrawlSpider):
        start_urls = ['TheWebsite']
        rules = [Rule(SgmlLinkExtractor(allow=['/?page=\d+']), 'parse')]

        def parse(self, response):
            sites = sel.xpath('MyXPath')
            for site in sites:
                if condition < 8:
                    yield Request(Link, meta = {'date': Date},\
                                  callback = self.MyFunction)
                else:
                    # Code to stop scrapy goes here.
The crawler will scrape all the data from URLs that match this pattern:
Mywebsite?page=INTEGER
However, when a specific condition occurs, I want to stop crawling. In my code, that is when the else branch is reached. How can I do this?
Upvotes: 0
Views: 2190
Reputation: 122024
To exit the for loop at that point, use break:
    for site in sites:
        if condition < 8:
            # ...
        else:
            break
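This will put you outside the for loop and, since nothing follows the loop, exit parse. In an ordinary function you could use return instead of break to send a value back rather than implicitly returning None; that also exits the function. Because parse contains yield, it is a generator, so under Python 2 (which Scrapy 0.20 runs on) only a bare return is allowed there. break also allows you to have further code in your function: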
    for ...:
        if something:
            break
    # do something else before finishing
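The difference between break and a bare return inside a generator can be tried without Scrapy. The following is only a sketch: parse_with_break, parse_with_return, and the limit parameter are hypothetical names, and plain integers stand in for the Request objects from the question.

```python
def parse_with_break(values, limit=8):
    """Yield values until one fails the condition, then run the code after the loop."""
    for value in values:
        if value < limit:
            yield value  # stands in for `yield Request(...)` in the question
        else:
            break  # leave the loop; later values are never looked at
    # Still reached after `break`.
    yield "done"


def parse_with_return(values, limit=8):
    """Same loop, but a bare `return` ends the generator immediately."""
    for value in values:
        if value < limit:
            yield value
        else:
            return  # nothing after the loop runs
    # Only reached if the loop finishes without hitting `return`.
    yield "done"


print(list(parse_with_break([1, 5, 9, 2])))   # [1, 5, 'done']
print(list(parse_with_return([1, 5, 9, 2])))  # [1, 5]
```

Either form only ends the current callback. Requests that Scrapy has already scheduled would still be processed; for shutting down the whole spider, Scrapy provides the CloseSpider exception in scrapy.exceptions, which can be raised from a callback.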
Upvotes: 1