Reputation: 3178
I have looked at similar posts, which come close to my case, but my result nonetheless seems unexpected.
import BeautifulSoup
import re
soup = BeautifulSoup.BeautifulSoup(<html page of interest>)
if (soup.find_all("td", attrs= {"class": "FilterElement"}, text= re.compile("HERE IS THE TEXT I AM LOOKING FOR")) is None):
    print('There was no entry')
else:
    print(soup.find("td", attrs= {"class": "FilterElement"}, text= re.compile("HERE IS THE TEXT I AM LOOKING FOR")))
I have replaced the actual HTML page and the text in the regular expression with placeholders. The rest is exactly as written. I get the following error:
Traceback (most recent call last):
  File "/Users/appa/src/workspace/web_forms/WebForms/src/root/queryForms.py", line 51, in <module>
    LoopThroughDays(form, id, trailer)
  File "/Users/appa/src/workspace/web_forms/WebForms/src/root/queryForms.py", line 33, in LoopThroughDays
    if (soup.find_all("td", attrs= {"class": "FilterElement"}, text= re.compile("HERE IS THE TEXT I AM LOOKING FOR")) is None):
TypeError: 'NoneType' object is not callable
I understand that the text will sometimes be missing. But I thought the way I set up the if statement would catch exactly that case, where the search finds nothing and the result is None.
Thanks in advance for any help!
Upvotes: 1
Views: 3200
Reputation: 454
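It looks like it's just a typo. With the BeautifulSoup 3 module you are importing, it should be soup.findAll, not soup.find_all. BeautifulSoup 3 treats an unknown attribute such as find_all as a search for a tag with that name, finds none, and returns None, so calling it raises the TypeError. I tried running it, and it works with the correction. So the full program should be: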
import BeautifulSoup
import re
soup = BeautifulSoup.BeautifulSoup(<html page of interest>)
# findAll returns a list, which is empty (not None) when nothing matches
if not soup.findAll("td", attrs= {"class": "FilterElement"}, text= re.compile("HERE IS THE TEXT I AM LOOKING FOR")):
    print('There was no entry')
else:
    print(soup.find("td", attrs= {"class": "FilterElement"}, text= re.compile("HERE IS THE TEXT I AM LOOKING FOR")))
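To see why the misspelled method produces that particular error, here is a minimal sketch using only the standard library. TinySoup is a hypothetical stand-in, not the real library: it only imitates, as far as I understand it, how BeautifulSoup 3 turns an unknown attribute into a tag lookup. It also shows that a "find all" style call hands back an empty list rather than None when nothing matches.

```python
class TinySoup:
    """Hypothetical stand-in that imitates BeautifulSoup 3's attribute fallback."""

    def __init__(self, tags):
        # tags maps a tag name to a list of text contents (illustrative only).
        self._tags = tags

    def findAll(self, name):
        # Always returns a list; the list is empty when nothing matches.
        return list(self._tags.get(name, []))

    def find(self, name):
        # Returns the first match, or None when the tag is absent.
        matches = self.findAll(name)
        return matches[0] if matches else None

    def __getattr__(self, name):
        # Python calls this only for attributes that do not exist,
        # so soup.find_all turns into soup.find("find_all"), which is None.
        return self.find(name)


soup = TinySoup({"td": ["some cell"]})

try:
    soup.find_all("td")  # really None("td")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

no_match = soup.findAll("th")

print(error_message)
print(no_match is None)  # the "is None" test never fires
print(not no_match)      # an emptiness test does
```

So the TypeError comes from calling the None returned by the attribute lookup, not from the search finding nothing, and testing the result for emptiness is what detects a missing entry.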
Upvotes: 2