Reputation: 1036
To extract the text I need, I can scrape most pages by using BeautifulSoup's find_next_sibling in a conditional:
if len(str(h4.find_next_sibling)) < 90:
    ...
else:
    ...
For one particular page, however, the page is empty, so h4 is None and Python raises:
AttributeError: 'NoneType' object has no attribute 'find_next_sibling'
The empty page(s) seem to come from errors in the list of pages I plan to scrape, and I need Python to keep scraping instead of stopping at every such page. One possibility is to write an if condition that runs the code above only when the page actually has an element to call find_next_sibling on. Is that possible? Any thoughts are appreciated!
Upvotes: 1
Views: 558
Reputation: 1036
Huge thanks to the commenters. The issue is solved by wrapping the code in try/except:
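try:
    # On an empty page h4 is None, so the attribute access below raises AttributeError
    if len(str(h4.find_next_sibling)) < 90:
        coverage = h4.find_next_sibling(text=True)
    else:
        if len(str(h4.find_next_sibling)[1]) < 90:
            coverage = h4.find_next_siblings(text=True)[2]
        else:
            coverage = h4.find_next_siblings(text=True)[1]
except (AttributeError, IndexError):
    # AttributeError: h4 is None; IndexError: fewer text siblings than expected
    coverage = "Empty page"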
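For anyone who prefers the if condition I originally asked about, here is a minimal sketch of an explicit None check. It assumes h4 comes from soup.find("h4"), which my post does not show, and the sample pages and the extract_coverage helper are made up for illustration. It also skips the length-based branching above and only shows the guard itself.

```python
from bs4 import BeautifulSoup

# Hypothetical sample pages; the second one is empty, like the broken page.
PAGES = {
    "good": "<html><body><h4>Coverage</h4><p>Worldwide</p></body></html>",
    "empty": "",
}


def extract_coverage(html):
    """Return the text of the element after the first <h4>, or a placeholder."""
    soup = BeautifulSoup(html, "html.parser")
    h4 = soup.find("h4")  # assumption: this is how h4 is obtained; None if absent
    if h4 is None:
        return "Empty page"
    sibling = h4.find_next_sibling()  # with parentheses, so the method is called
    if sibling is None:
        return "Empty page"
    return sibling.get_text(strip=True)


# The loop keeps going past the empty page instead of raising.
results = {name: extract_coverage(html) for name, html in PAGES.items()}
print(results)
```

Compared with try/except, the explicit check only handles the "no h4" and "no sibling" cases, so any other unexpected error still surfaces instead of being silently labelled as an empty page.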
Upvotes: 2