Reputation: 3086
I am using the code below to scrape XFN content from the web page http://ajaxian.com, but I am getting an undefined variable error.
My code is as follows:
'''
Created on Jan 11, 2013
@author: Somnath
'''
# Scraping XFN content from a web page
# -*-coding: utf-8 -*-
import sys
import urllib2
import HTMLParser
from BeautifulSoup import BeautifulSoup
# Try http://ajaxian.com
URL = sys.argv[0]
XFN_TAGS = set([
    'colleague',
    'sweetheart',
    'parent',
    'co-resident',
    'co-worker',
    'muse',
    'neighbor',
    'sibling',
    'kin',
    'child',
    'date',
    'spouse',
    'me',
    'acquaintance',
    'met',
    'crush',
    'contact',
    'friend',
    ])
try:
    page = urllib2.urlopen(URL)
except urllib2.URLError:
    print 'Failed to fetch ' + item
try:
    soup = BeautifulSoup(page)
except HTMLParser.HTMLParseError:
    print 'Failed to parse ' + item
anchorTags = soup.findAll('a')
for a in anchorTags:
    if a.has_key('rel'):
        if len(set(a['rel'].split()) & XFN_TAGS) > 0:
            tags = a['rel'].split()
            print a.contents[0], a['href'], tags
My code has two try blocks, and it gives the error "undefined variable: item". If I want to keep the try-except blocks, should I give the variable item a blank definition outside the try blocks?
P.S. Please note that this is standard code taken from a book, and I would expect that the authors would not have made such a trivial mistake. Am I getting something wrong here?
Upvotes: 0
Views: 3875
Reputation: 448
You did not define the variable 'item'. That's what is causing the error. You must define a variable before you use it.
Upvotes: 0
Reputation: 66647
print 'Failed to fetch ' + item
item is not defined anywhere. I guess you wanted to print URL there.
As the Python tutorial says:
Variables must be “defined” (assigned a value) before they can be used, or an error will occur:
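To see what that looks like in practice, here is a minimal sketch. The function name fetch and the simulated failure are made up for illustration, and nothing here touches the network. Note that the NameError only surfaces when the except branch actually runs, which is why a script like this can appear to work until a fetch fails:

```python
def fetch(url):
    """Hypothetical stand-in for the fetch step; it always fails on purpose."""
    try:
        raise IOError('simulated fetch failure for ' + url)
    except IOError:
        # 'item' was never assigned anywhere, so evaluating it raises NameError.
        return 'Failed to fetch ' + item


try:
    fetch('http://example.invalid')
except NameError as exc:
    # Keep the interpreter's message so it can be inspected or printed.
    message = str(exc)
    print(message)
```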
Upvotes: 2
Reputation: 37269
Assuming that you want to print the URL that failed to load, try changing it to print 'Failed to fetch ' + URL. You aren't actually defining item anywhere, so Python doesn't know what you mean:
try:
    page = urllib2.urlopen(URL)
except urllib2.URLError:
    print 'Failed to fetch ' + URL
And in your second block, change item to URL as well (assuming the error you want to display shows the URL and not the content).
try:
    soup = BeautifulSoup(page)
except HTMLParser.HTMLParseError:
    print 'Failed to parse ' + URL
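One more thing worth considering: printing the message is not enough on its own, because after a failed fetch the name page is never assigned and the next block would fail in the same way. Below is a rough sketch of one way to stop early. It is written for Python 3 as an assumption (the Python 2 equivalents are urllib2.urlopen and urllib2.URLError), and the helper names fetch and main and the opener parameter are hypothetical, not from the book. It also reads the URL from sys.argv[1], since sys.argv[0] is the script's own path:

```python
import sys
from urllib.error import URLError
from urllib.request import urlopen


def fetch(url, opener=urlopen):
    """Return the page body, or None if the fetch fails.

    'opener' is a hypothetical hook so the function can be tried offline.
    """
    try:
        return opener(url).read()
    except URLError:
        # Report the URL itself; this script has no 'item' variable.
        sys.stderr.write('Failed to fetch ' + url + '\n')
        return None


def main(argv):
    """Return an exit status instead of carrying on with missing data."""
    if len(argv) < 2:
        sys.stderr.write('usage: script.py URL\n')
        return 2
    # argv[0] is the script's own path; the first real argument is argv[1].
    body = fetch(argv[1])
    if body is None:
        return 1  # stop here: there is nothing to parse
    # Parsing and the XFN tag check would follow here.
    return 0
```

To run it as a script, you would call sys.exit(main(sys.argv)) at the bottom of the file.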
Upvotes: 2