hengyue li

Reputation: 468

< and > become &lt; and &gt; in BeautifulSoup

Assume I have an item div, where div is a BeautifulSoup object (obtained with findAll). The source looks like:

<div>text1 <span>text2</span></div>

What I want to do is to replace text1 with text3. I tried:

  1. div.string.replace_with(newstr), where newstr = "text3 <span>text2</span>". This does not work because div.string is None.

  2. div.replace_with(newstr). This does not work because the final result shows &lt; and &gt; rather than "<" and ">" when I save the HTML to a file.

Upvotes: 1

Views: 1180

Answers (2)

KunduK

Reputation: 33384

You can find the div tag, take its next_element (the text node containing text1), and call replace_with on that node to swap in text3:

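from bs4 import BeautifulSoup

html = '''<div>text1 <span>text2</span></div>'''
soup = BeautifulSoup(html, 'lxml')
# The div's next_element is its first child, the text node 'text1 '.
# The replacement keeps the trailing space so the text does not run into the <span>.
soup.find('div').next_element.replace_with('text3 ')
print(soup)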
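
Note that next_element is simply whatever was parsed right after the opening tag, so it is a text node only when the div starts with text. A minimal sketch of a more defensive variant follows; replace_first_text is a hypothetical helper name (not part of bs4), the sample markup with a leading <b> tag is made up for illustration, and html.parser is used instead of lxml only so the snippet needs no extra parser installed.

```python
from bs4 import BeautifulSoup, NavigableString


def replace_first_text(tag, old, new):
    """Replace `old` with `new` in the first direct text child of `tag` containing it.

    Hypothetical helper for illustration. Returns True if a text node was changed.
    """
    for child in tag.children:
        # Only look at direct text children; nested tags are left alone.
        if isinstance(child, NavigableString) and old in child:
            # str.replace gives a plain string; replace_with inserts it as text.
            child.replace_with(child.replace(old, new))
            return True
    return False


# Here the div starts with a tag, so div.next_element would be <b>, not the text.
soup = BeautifulSoup("<div><b>bold</b> text1 <span>text2</span></div>", "html.parser")
div = soup.find("div")
changed = replace_first_text(div, "text1", "text3")
result = str(div)
print(result)
```

Because only the text node is swapped, the surrounding tags and whitespace stay as they were.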

Upvotes: 1

user5386938

Reputation:

Just playing around with the interactive prompt... I'm sure there's a better solution but...

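from bs4 import BeautifulSoup

data = '''<div>text1 <span>text2</span></div>'''
soup = BeautifulSoup(data, features="lxml")
div = soup.find('div')
# div.contents is [text node 'text1 ', <span> tag]; only the text node is needed.
a, *b = div.contents
# str.replace returns a new plain string with only the text changed.
c = a.replace('text1', 'text3')
# Swap the old text node for the new string; the <span> is untouched.
a.replace_with(c)
print(div)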
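
As for why the two attempts in the question fail, the sketch below reproduces both and shows one possible alternative: parsing the replacement markup first, so that a Tag rather than a plain string is passed to replace_with. This is my own illustration, not taken from the question; it uses html.parser instead of lxml so that str(soup) is not wrapped in <html><body>, and the variable names are arbitrary.

```python
from bs4 import BeautifulSoup

html = "<div>text1 <span>text2</span></div>"

# Attempt 1: .string is None because the div has more than one child
# (a text node and a <span>), so there is no single string to return.
soup = BeautifulSoup(html, "html.parser")
div = soup.find("div")
string_is_none = div.string is None

# Attempt 2: a plain str passed to replace_with is inserted as a text node,
# so its "<" and ">" are escaped when the tree is written out.
div.replace_with("text3 <span>text2</span>")
escaped = str(soup)

# Alternative: parse the new markup first and insert the resulting Tag.
soup = BeautifulSoup(html, "html.parser")
new_div = BeautifulSoup("<div>text3 <span>text2</span></div>", "html.parser").div
soup.find("div").replace_with(new_div)
parsed = str(soup)

print(escaped)
print(parsed)
```

If only the text needs to change, replacing just the text node as in the answers here is simpler; parsing a fragment is mainly useful when the replacement really has to contain markup.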

Upvotes: 0
