BeautifulSoup : create and insert self closing tag between a tag

Question

I am parsing html files and replacing specific links with new tags.

Python Code:

from bs4 import BeautifulSoup
sample='''{Image src='https://google.com' link='google.com'}'''
soup=BeautifulSoup(sample)
for a in soup.findAll('a'):
    x=BeautifulSoup(' ')
    a=a.replace_with(x)

print(soup)

Actual Output:

Desired Output:

The self closing Tags are automatically getting converted. The destination strictly needs self closing tags.

Any help would be appreciated!

Andrej Kesely · Accepted Answer

To get correct self closing tags, use parser xml when creating new soup that is going to replace old tag.

Also, to preserve ac and ri namespaces, xml parser requires to define xmlns:ac and xmlns:ri parameters. We define these parameters in a dummy tag that is removed after processing.

For example:

from bs4 import BeautifulSoup
import xml
txt = '''

    
        
    


    
        
    

'''

template = '''


  


'''

soup = BeautifulSoup(txt, 'html.parser')

for a in soup.select('a'):
    a=a.replace_with(BeautifulSoup(template.format(img_src=a.img['src']), 'xml'))  # <-- select `xml` parser, the template needs to have xmlns:* parameters to preserve namespaces

for div in soup.select('div._remove_me'):
    dump=div.unwrap()

print(soup.prettify())

Prints:

BeautifulSoup : create and insert self closing tag between a tag

Answers (1)

Related Questions