tlre0952b

Reputation: 751

Python error NoneType

I have a Python script that fetches a web page and parses its HTML with BeautifulSoup 4 (BS4). I then locate a specific heading in the HTML and extract its text, like this:

r = br.open("http://example.com")
html = r.read()
r.close()
soup = BeautifulSoup(html)
# Get the contents of the html tag (h1) that displays results
searchResult = soup.find("h1").contents[0]
# Get only the number, remove all text
if not(searchResult == None):
    searchResultNum = int(re.match(r'\d+', searchResult).group())
else:
    searchResultNum = 696969

The actual HTML code doesn't change. It always looks like this:

<div id="resultsCount">
    <h1 class="f12">606 Results matched</h1>
</div>

The problem is that my script runs fine for about 4 minutes (it varies) and then crashes with:

Traceback (most recent call last):
  File "C:\Users\Me\Documents\Aptana Studio 3 Workspace\PythonScripts\PythonScripts\setupscript.py", line 109, in <module>
    searchResultNum = int(re.match(r'\d+', searchResult).group())
AttributeError: 'NoneType' object has no attribute 'group'

I thought I was handling this error. I guess I just do not understand it. Can you help?

Thanks.

Upvotes: 0

Views: 88

Answers (1)

cmd

Reputation: 5830

If searchResult does not start with a number, re.match(r'\d+', searchResult) returns None, and None has no group attribute. Your check only tests searchResult, not the match object, so this case is never handled. Also, if not(searchResult == None): is clumsy; use if searchResult: instead:

searchResultNum = 696969  # default when no number is found
if searchResult:
    m = re.match(r'\d+', searchResult)
    # re.match returns None when the text does not start with digits
    if m:
        searchResultNum = int(m.group())
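
If you want to check this behaviour without hitting the site, the same logic can be wrapped in a small function and fed a few strings. This is only a sketch: the function name parse_result_count and the constant DEFAULT_COUNT are made up for illustration, and the fallback value is the one from your question.

```python
import re

DEFAULT_COUNT = 696969  # fallback value taken from the question


def parse_result_count(text, default=DEFAULT_COUNT):
    """Return the integer at the start of text, or default if there is none.

    parse_result_count is a hypothetical helper, not part of BS4 or re.
    """
    if not text:
        # Covers None and the empty string
        return default
    m = re.match(r'\d+', text)
    if m is None:
        # The text does not begin with a digit, so there is nothing to call .group() on
        return default
    return int(m.group())


print(parse_result_count("606 Results matched"))  # 606
print(parse_result_count("No results matched"))   # 696969
print(parse_result_count(None))                   # 696969
```

The second call is the case that crashes in your script: the heading text is there, but it does not start with a digit, so re.match returns None.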

Upvotes: 1
