Reputation: 53
I am trying to collect all the numbers from an HTML page into a list using BeautifulSoup:
import urllib
from BeautifulSoup import *
import re
line = None
url = raw_input('Enter - ')
html = urllib.urlopen(url).read()
soup = BeautifulSoup(html)
# Retrieve all of the anchor tags
tags = soup('span')
for line in tags:
    line = line.strip()
    numlist = re.findall('[0-9]+' , tags)
    print numlist
I'm getting a traceback:
Traceback (most recent call last):
  File "C:\Documents and Settings\mea388\Desktop\PythonSchool\new 12.py", line 14, in <module>
    line = line.strip()
TypeError: 'NoneType' object is not callable
I cannot understand why I'm getting a traceback.
Upvotes: 0
Views: 3021
Reputation: 621
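That's because you are calling strip on a BeautifulSoup Tag object, not on a string. A Tag has no strip method. BeautifulSoup treats attribute access on a tag as a search for a child tag with that name, so line.strip looks for a <strip> child, finds none, and returns None. Calling None is what raises the TypeError.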
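To see the behaviour in isolation, here is a minimal sketch. It assumes Beautiful Soup 4 (the bs4 package) on Python 3 rather than the BeautifulSoup 3 import from the question, and the one-line HTML snippet is made up for illustration.

```python
from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4, not BeautifulSoup 3

# Made-up markup standing in for one <span> from the real page.
soup = BeautifulSoup("<span>42</span>", "html.parser")
tag = soup.find("span")

# Attribute access on a Tag is treated as a search for a child tag of that
# name. There is no <strip> child here, so the lookup yields None.
strip_attr = tag.strip

# Calling that None is what produces the error from the question.
try:
    tag.strip()
    error = None
except TypeError as exc:
    error = str(exc)

print(strip_attr)
print(error)
```

The text of the tag lives in tag.string (or tag.get_text()), and that is the object strip should be called on.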
Change line 14 to:
line = line.string.strip()
However, be aware that line.string can itself be None when the tag you are searching for has multiple sub-elements, in which case calling strip on it fails again. See the section on the .string attribute in the Beautiful Soup documentation.
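One way to sidestep the None case is to read the text with get_text() instead of .string. The following is only a sketch under assumptions: it uses Beautiful Soup 4 on Python 3, and SAMPLE_HTML and numbers_in_spans are hypothetical names invented here to stand in for the downloaded page and the loop from the question.

```python
import re
from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4, not BeautifulSoup 3

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<table>
  <tr><td>Alice</td><td><span class="comments">97</span></td></tr>
  <tr><td>Bob</td><td><span class="comments">Score: <b>12</b> and 5</span></td></tr>
  <tr><td>Eve</td><td><span class="comments"></span></td></tr>
</table>
"""


def numbers_in_spans(html):
    """Return every run of digits found in the text of <span> tags, as ints."""
    soup = BeautifulSoup(html, "html.parser")
    numbers = []
    for tag in soup.find_all("span"):
        # get_text() joins the text of all descendants and returns an empty
        # string for an empty tag, so it never yields None the way .string can.
        text = tag.get_text().strip()
        numbers.extend(int(n) for n in re.findall(r"[0-9]+", text))
    return numbers


if __name__ == "__main__":
    print(numbers_in_spans(SAMPLE_HTML))
```

Note that re.findall is given the text of each tag, not the list of tags, and the results are accumulated in a single list across all spans.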
Upvotes: 2