Evg

Reputation: 3080

Replace HTML links with text

How can I replace links in HTML with their anchor text, using Python?

For example, given this input:

 <p> Hello <a href="http://example.com">link text1</a> and <a href="http://example.com">link text2</a> ! </p>

I want a result that keeps the p tag and removes only the a tags:

<p>
Hello link text1 and link text2 ! 
</p>

Upvotes: 4

Views: 5102

Answers (3)

miindlek

Reputation: 3563

You could do this with a simple regex and the sub function:

import re

text = '<p> Hello <a href="http://example.com">link text1</a> and <a href="http://example.com">link text2</a> ! </p>'
# \b keeps the pattern from also matching tags such as <abbr>
pattern = r'</?a\b[^>]*>'

result = re.sub(pattern, "", text)

print(result)
# <p> Hello link text1 and link text2 ! </p>

This code replaces every <a ...> and </a> tag with an empty string.
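
As a rough sketch, the same idea can be wrapped in a small function. The name strip_anchor_tags is made up here for illustration. The word boundary after the a keeps the pattern from touching other tags that start with that letter, such as <abbr>. Regexes are still fragile on real-world HTML (for example, an attribute value containing >), so this assumes simple, well-formed input.

```python
import re

# Matches <a ...> and </a>, but not <abbr>, <article>, and so on.
ANCHOR_TAG = re.compile(r'</?a\b[^>]*>', re.IGNORECASE)


def strip_anchor_tags(html):
    """Remove anchor tags but keep the text between them (hypothetical helper)."""
    return ANCHOR_TAG.sub('', html)


sample = '<p> Hello <a href="http://example.com">link text1</a> and <abbr>etc</abbr> ! </p>'
print(strip_anchor_tags(sample))
```

Here the abbr element should come through unchanged while the link is reduced to its text.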

Upvotes: 5

Nitin

Reputation: 23

You can use a parser library for this, such as BeautifulSoup or one of the others. I am not sure about the details, but you can find something here
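
To make the parser idea a bit more concrete, here is a minimal sketch using only the standard library's html.parser module rather than BeautifulSoup. The names AnchorStripper and strip_links are hypothetical, chosen just for this example. It re-emits everything it sees except a tags, and as written it drops comments and doctype declarations, so it is a starting point and not a complete solution.

```python
from html.parser import HTMLParser


class AnchorStripper(HTMLParser):
    """Re-emit HTML as parsed, except that <a> and </a> tags are dropped."""

    def __init__(self):
        # convert_charrefs=False so entities like &amp; are passed through as-is
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            # get_starttag_text() returns the original source text of the tag
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are emitted once, unchanged
        if tag != 'a':
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag != 'a':
            self.parts.append('</%s>' % tag)

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append('&%s;' % name)

    def handle_charref(self, name):
        self.parts.append('&#%s;' % name)


def strip_links(html):
    """Return html with anchor tags removed and their text kept."""
    parser = AnchorStripper()
    parser.feed(html)
    parser.close()
    return ''.join(parser.parts)


print(strip_links('<p> Hello <a href="http://example.com">link text1</a> ! </p>'))
```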

Upvotes: 0

shaktimaan

Reputation: 12092

Looks like a perfect case for BeautifulSoup's unwrap() method:

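from bs4 import BeautifulSoup

data = '''<p> Hello <a href="http://example.com">link text1</a> and <a href="http://example.com">link text2</a> ! </p>'''
# Name the parser explicitly to avoid bs4's "no parser specified" warning
soup = BeautifulSoup(data, 'html.parser')
p_tag = soup.find('p')
# unwrap() removes the tag itself but leaves its contents in place
for a_tag in p_tag.find_all('a'):
    a_tag.unwrap()
print(p_tag)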

This gives:

<p> Hello link text1 and link text2 ! </p>

Upvotes: 3
