Reputation: 21
I wrote the following loop for my web crawler.
It runs out of time after a few seconds, and I can't figure out why.
def crawlweb(seed):
    crawled = []
    tocrawl = [seed]
    page = tocrawl[0]
    while tocrawl:
        if page not in crawled:
            tocrawl = tocrawl[1:] + (get_links(get_page(page)))
            crawled.append(page)
    return crawled, tocrawl
Upvotes: 0
Views: 352
Reputation: 21
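def crawl_web(seed):
    tocrawl = [seed]
    crawled = []
    while tocrawl:
        # Take the next page inside the loop so it changes on every pass.
        page = tocrawl.pop()
        if page not in crawled:
            # union() is assumed to add only links not already in tocrawl.
            union(tocrawl, get_all_links(get_page(page)))
            crawled.append(page)
    return crawled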
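
As far as I can tell, the key difference from the loop in the question is that page is taken from tocrawl inside the while. In the question it is set once before the loop, so after the first pass the if is never true again and tocrawl never empties. To try the version above without fetching real pages, here is a minimal sketch. FAKE_WEB and the bodies of get_page, get_all_links and union are stand-ins I made up for illustration, not the real helpers.

```python
# Hypothetical in-memory "web" so the sketch runs without network access.
FAKE_WEB = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": [],
}

def get_page(url):
    # Stand-in: the "page content" is just the list of outgoing links.
    return FAKE_WEB.get(url, [])

def get_all_links(page):
    # Stand-in: the fake page content is already a list of links.
    return list(page)

def union(target, items):
    # Append each item to target in place, skipping ones already present.
    for item in items:
        if item not in target:
            target.append(item)

def crawl_web(seed):
    tocrawl = [seed]
    crawled = []
    while tocrawl:
        # pop() removes the page from tocrawl, so the list shrinks
        # whenever no new links are added.
        page = tocrawl.pop()
        if page not in crawled:
            union(tocrawl, get_all_links(get_page(page)))
            crawled.append(page)
    return crawled

print(crawl_web("a"))
```

The fake pages link back to each other, and the loop still finishes because a page that is already in crawled is popped and skipped instead of being expanded again.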
Upvotes: 1