Zlo

Reputation: 1170

Circumventing Python errors in a script

I have a large file containing thousands of links. I've written a script that reads each link line by line and performs various analyses on the corresponding webpage. However, sometimes a link is faulty (the article was removed from the website, etc.), and my whole script just stops at that point.

Is there a way to circumvent this problem? Here's my (pseudo)code:

import urllib2
import lxml.html

for row in file:
    url = row[4]
    req = urllib2.Request(url)
    tree = lxml.html.fromstring(urllib2.urlopen(req).read())
    # perform analyses
    # append analyses results to lists
# output data

I have tried adding

except:
    pass

but it royally messes up the script for some reason.
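A likely reason: if the bare except wraps only the fetch, execution falls through to the analysis step with tree undefined (or left over from the previous row). A minimal sketch of that failure mode, with the try placement assumed since the original isn't shown:

for row in file:
    url = row[4]
    try:
        req = urllib2.Request(url)
        tree = lxml.html.fromstring(urllib2.urlopen(req).read())
    except:
        pass  # the error is swallowed here...
    # ...but execution continues, and tree may be undefined
    # (or stale from the previous row) when the analyses run
    # perform analyses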

Upvotes: 0

Views: 38

Answers (2)

user2097159

Reputation: 892

A try/except block is the way to go:

import urllib2
import lxml.html
from urllib2 import URLError

for row in file:
    url = row[4]
    try:
        req = urllib2.Request(url)
        tree = lxml.html.fromstring(urllib2.urlopen(req).read())
    except URLError:
        continue  # skip to the next link on a fetch error
    # perform analyses
    # append analyses results to lists
# output data

Using continue allows you to skip the now-unnecessary computation after a failed URL and restart at the next iteration of the loop.
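A small variation on the same idea (the failed list here is illustrative, not part of the original answer): record which URLs were skipped instead of dropping them silently.

import urllib2
import lxml.html
from urllib2 import URLError

failed = []  # illustrative: collect the links that could not be fetched

for row in file:
    url = row[4]
    try:
        req = urllib2.Request(url)
        tree = lxml.html.fromstring(urllib2.urlopen(req).read())
    except URLError:
        failed.append(url)  # remember the bad link, then move on
        continue
    # perform analyses
    # append analyses results to lists
# output data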

Upvotes: 0

Vincent Beltman

Reputation: 2104

Works for me:

import urllib2
import lxml.html
from urllib2 import URLError

for row in file:
    url = row[4]
    try:
        req = urllib2.Request(url)
        tree = lxml.html.fromstring(urllib2.urlopen(req).read())
        # perform analyses
        # append analyses results to lists
    except URLError:
        pass  # ignore rows whose page can't be fetched
# output data
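One caveat: catching only URLError still lets other failures stop the script, such as a socket timeout or an lxml parse error on an empty response. A broader, hedged variant (the extra exception types are assumptions about what faulty links might raise, not part of the original answer):

import socket
import urllib2
import lxml.etree
import lxml.html

for row in file:
    url = row[4]
    try:
        req = urllib2.Request(url)
        tree = lxml.html.fromstring(urllib2.urlopen(req).read())
        # perform analyses
        # append analyses results to lists
    except (urllib2.URLError, socket.error, lxml.etree.LxmlError):
        pass  # skip rows whose page can't be fetched or parsed
# output data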

Upvotes: 2
