Bennett Brown

Reputation: 5383

Inserting a newly constructed element in BeautifulSoup

I tried to insert a comment before a <div id="apb">. The error suggested a workaround, which did indeed work. Was I doing something wrong with BeautifulSoup or is there an error in the BeautifulSoup source? Minimal executable version of my original code:

from bs4 import BeautifulSoup
from bs4 import Comment
soup = BeautifulSoup('<p>This</p>Insert here:***<div id="apb">Stuff</div>')
div = soup.find(id="apb")
comment = Comment('APB section')
div.insert_before(comment)

This produces the traceback:

AttributeError                            Traceback (most recent call last)
<ipython-input-20-09e7eb15e6f2> in <module>()
  4 div = soup.find(id="apb")
  5 comment = Comment('APB section')
----> 6 div.insert_before(comment)
  7

C:\Users\bbrown\AppData\Local\Enthought\Canopy\User\lib\site-packages\bs4\element.pyc in insert_before(self, predecessor)
353         # are siblings.
354         if isinstance(predecessor, PageElement):
--> 355             predecessor.extract()
356         index = parent.index(self)
357         parent.insert(index, predecessor)

C:\Users\bbrown\AppData\Local\Enthought\Canopy\User\lib\site-packages\bs4\element.pyc in extract(self)
232     def extract(self):
233         """Destructively rips this element out of the tree."""
--> 234         if self.parent is not None:
235             del self.parent.contents[self.parent.index(self)]
236

C:\Users\bbrown\AppData\Local\Enthought\Canopy\User\lib\site-packages\bs4\element.pyc in __getattr__(self, attr)
673             raise AttributeError(
674                 "'%s' object has no attribute '%s'" % (
--> 675                     self.__class__.__name__, attr))
676
677     def output_ready(self, formatter="minimal"):

AttributeError: 'Comment' object has no attribute 'parent'

I am using Python 2.7. I believe I am using beautifulsoup4 v4.3.2; that is the version reported by the Canopy Package Manager, though accessing BeautifulSoup.__version__ raises an AttributeError.

I suspect the problem might be in the BeautifulSoup source because the following five-line workaround, run before calling insert_before, made the insert succeed:

comment.parent = None
comment.next_sibling = None
comment.next_element = None
comment.previous_sibling = None
comment.previous_element = None

I would think that the Comment constructor would have set those values to None or that the element.py code would test for attribute existence rather than testing for equality with None. Is the error mine or is it a problem with the BeautifulSoup source?

Upvotes: 3

Views: 402

Answers (1)

alecxe

Reputation: 473833

Here is a related bug:

Either upgrade to beautifulsoup4 4.4.1 (the latest release at the time of writing), which contains the fix:

pip install beautifulsoup4 --upgrade

Or, apply the workaround suggested by Martijn:

comment = soup.new_string('APB section', Comment)
div.insert_before(comment)
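
For completeness, here is a minimal end-to-end sketch of that workaround applied to the snippet from the question. Passing "html.parser" explicitly and the variable names html and result are assumptions added for illustration; they are not part of the original code.

```python
from bs4 import BeautifulSoup, Comment

html = '<p>This</p>Insert here:***<div id="apb">Stuff</div>'

# Assumption: name the parser explicitly; html.parser ships with Python.
soup = BeautifulSoup(html, "html.parser")
div = soup.find(id="apb")

# Build the comment through the soup instead of calling Comment(...) directly.
comment = soup.new_string("APB section", Comment)

# Place the comment immediately before the <div id="apb"> element.
div.insert_before(comment)

result = str(soup)
print(result)
```

If the insert worked, the comment is the node directly before the div, so div.previous_sibling should be the Comment object.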

Upvotes: 3
