Reputation: 1607
When I try to insert the following HTML into an element
<div class="frontpageclass"><h3 id="feature_title">The Title</h3>... </div>
bs4 is replacing it like this:
&lt;div class="frontpageclass"&gt;&lt;h3 id="feature_title"&gt;The Title &lt;/h3&gt;... &lt;div&gt;&lt;/div&gt;
I am using .string and it is still messing up the format.
with open(html_frontpage) as fp:
    soup = BeautifulSoup(fp, "html.parser")
found_data = soup.find(class_='front-page__feature-image')
found_data.string = databasedata
If I try to use found_data.string.replace_with I get a NoneType error. found_data is of type Tag.
There is a similar issue, but they are using div, not class.
Upvotes: 5
Views: 4552
Reputation: 1
The only thing wrong with the format is that '<' and '>' have been turned into &lt; and &gt;. Replacing all of them should work.
E.g., suppose BeautifulSoup has inserted the HTML into the soup1 variable with the messed-up format:
a = str(soup1).replace('&lt;', '<').replace('&gt;', '>')
print(a)
The variable a should then hold the HTML in the correct format.
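As a rough sketch of the replacement idea above, the snippet below compares the chained replace with the standard library's html.unescape, which decodes every entity rather than just two. The escaped string is a made-up stand-in for what str(soup1) might contain. Note that running either approach over a whole serialized document would also decode entities that were meant to stay escaped, so this is probably only safe on a fragment you control.

```python
import html

# Escaped markup, similar to what appears after assigning HTML to a tag's .string
# (illustrative sample, not taken from a real page)
escaped = ('&lt;div class="frontpageclass"&gt;'
           '&lt;h3 id="feature_title"&gt;The Title&lt;/h3&gt;&lt;/div&gt;')

# Chained replace: handles only &lt; and &gt;
by_replace = escaped.replace('&lt;', '<').replace('&gt;', '>')

# html.unescape decodes all HTML entities, including &amp; and &quot;
by_unescape = html.unescape(escaped)

print(by_replace)
print(by_unescape)
```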
Upvotes: 0
Reputation: 338158
Setting the element's .text or .string causes the value to be HTML-encoded, which is the right thing to do. It ensures that the text you insert will appear 1:1 when the document is displayed in a browser.
If you want to insert actual HTML, you need to insert new nodes into the tree.
from bs4 import BeautifulSoup
# always define a file encoding when working with text files
with open(html_frontpage, encoding='utf8') as fp:
    soup = BeautifulSoup(fp, "html.parser")
target = soup.find(class_='front-page__feature-image')
# empty out the target element if needed
target.clear()
# create a temporary document from your HTML
content = '<div class="frontpageclass"><h3 id="feature_title">The Title</h3>...</div>'
# name the parser explicitly: html.parser does not wrap the fragment in
# <html>/<body>, so the nodes we want are direct children of `temp`
temp = BeautifulSoup(content, "html.parser")
# copy the children into a list first, because inserting a node
# elsewhere removes it from `temp` while we iterate
nodes_to_insert = list(temp.children)
# insert them, in source order
for i, node in enumerate(nodes_to_insert):
    target.insert(i, node)
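The steps above could be wrapped in a small helper. The sketch below is only an illustration: set_inner_html is a hypothetical name, not part of the bs4 API, and the page and snippet strings are made-up stand-ins for the real file and database content. It also shows the contrast with assigning to .string.

```python
from bs4 import BeautifulSoup


def set_inner_html(target, markup):
    """Replace the children of `target` with nodes parsed from `markup`.

    Hypothetical helper for illustration; not part of the bs4 API.
    """
    target.clear()
    # html.parser keeps the fragment as-is (no <html>/<body> wrapper)
    fragment = BeautifulSoup(markup, "html.parser")
    # list(...) snapshots the children; append() detaches each one from `fragment`
    for node in list(fragment.contents):
        target.append(node)


# Made-up sample inputs
page = '<section class="front-page__feature-image">old</section>'
snippet = '<div class="frontpageclass"><h3 id="feature_title">The Title</h3></div>'

# Assigning to .string stores the markup as text, so it is escaped on output
soup = BeautifulSoup(page, "html.parser")
soup.find(class_="front-page__feature-image").string = snippet
escaped_output = str(soup)

# Inserting parsed nodes keeps the markup as real elements
soup = BeautifulSoup(page, "html.parser")
set_inner_html(soup.find(class_="front-page__feature-image"), snippet)
inserted_output = str(soup)

print(escaped_output)
print(inserted_output)
```

After the second call, soup.find(id="feature_title") returns the inserted h3 tag, which suggests the markup really is part of the tree rather than a text node.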
Upvotes: 8