Remove <a> tag from a string by HREF attribute

Question

I have an HTML body. Here is a possible extract:

body = 'Hi what is your name?....other stuffs'

This could be much longer, with other HTML tags and possibly other content too.

I also have a URL I want to remove from the body:

url_to_remove = 'url_example_1'

Is there a regex or some other way to get a new body with the tag for url_to_remove removed?

My new body should be:

new_body = 'Hi what is your name?....other stuffs'

shoytov · Accepted Answer

Try this:

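from bs4 import BeautifulSoup

body = 'HTML code here'
to_delete = 'deprecated url'
# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(body, "html.parser")
for element in soup.find_all("a"):
    # .get() returns None for anchors without an href instead of raising KeyError
    if element.get('href') == to_delete:
        # Replace the whole <a> tag with its plain text
        element.replace_with(element.text)
# Convert the soup back to a string
body = str(soup)

print(body)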
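
Since the question also asks about a regex, here is a rough sketch of that route using only the standard library. The function name remove_links_regex and the sample string are made up for illustration, and the pattern assumes a quoted href and non-nested anchors. Regexes are fragile on real-world HTML, so treat this as a fallback rather than a replacement for a parser.

```python
import re


def remove_links_regex(body, url_to_remove):
    """Strip <a> tags whose href equals url_to_remove, keeping the link text."""
    pattern = re.compile(
        # Opening <a ...> tag with a quoted href equal to the target URL
        r'<a\b[^>]*\bhref\s*=\s*(["\'])' + re.escape(url_to_remove) + r'\1[^>]*>'
        # Link contents (non-greedy) up to the nearest closing </a>
        r'(.*?)</a\s*>',
        re.IGNORECASE | re.DOTALL,
    )
    # Keep only the contents captured in group 2
    return pattern.sub(r'\g<2>', body)


# Hypothetical input: these anchors are invented for the example
sample = 'Hi <a href="url_example_1">what</a> is your <a href="url_example_2">name</a>?'
print(remove_links_regex(sample, "url_example_1"))
```

Anchors pointing at any other URL are left untouched, because the pattern only matches when the href value equals url_to_remove exactly.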
