Reputation: 3080
How to replace links with their anchor text in HTML (Python)?
For example, given this input:
<p> Hello <a href="http://example.com">link text1</a> and <a href="http://example.com">link text2</a> ! </p>
I want the result to keep the p tag and remove only the a tags:
<p>
Hello link text1 and link text2 !
</p>
Upvotes: 4
Views: 5102
Reputation: 3563
You could do this with a simple regex and the re.sub function:
import re

text = '<p> Hello <a href="http://example.com">link text1</a> and <a href="http://example.com">link text2</a> ! </p>'
# Match opening <a ...> and closing </a> tags; \b keeps tags such as <abbr> from matching
pattern = r'</?a\b[^>]*>'
result = re.sub(pattern, "", text)
print(result)
# prints: <p> Hello link text1 and link text2 ! </p>
This code replaces every opening <a ...> and closing </a> tag with an empty string.
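In case it helps, here is a minimal sketch of the same regex idea wrapped in a reusable function. The name strip_anchor_tags is hypothetical, and the pattern uses a word boundary plus re.IGNORECASE so that tags such as <abbr> are left alone and upper-case <A> is also handled. Regexes are fragile on arbitrary HTML (for instance a > inside an attribute value), so treat this as suitable only for simple, well-formed input.

```python
import re

# Hypothetical helper: matches <a ...> and </a>, but not <abbr>, <article>, etc.
ANCHOR_TAG = re.compile(r'</?a\b[^>]*>', re.IGNORECASE)

def strip_anchor_tags(html):
    """Remove anchor tags but keep the text between them."""
    return ANCHOR_TAG.sub('', html)

sample = '<p><abbr title="x">HTML</abbr> <A HREF="http://example.com">link</A></p>'
print(strip_anchor_tags(sample))
# <p><abbr title="x">HTML</abbr> link</p>
```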
Upvotes: 5
Reputation: 23
You can use an HTML parser library for this, such as BeautifulSoup, among others. I am not sure of the exact approach, but you may find something useful here
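As one possible illustration of the parser approach, here is a rough sketch using html.parser from the Python 3 standard library instead of BeautifulSoup. The class AnchorStripper and the function strip_links are hypothetical names. The sketch re-emits everything the parser reports except <a> start and end tags; comments and doctype declarations are dropped, which is an assumption that may not suit every document.

```python
from html.parser import HTMLParser

class AnchorStripper(HTMLParser):
    """Re-emit the parsed document, skipping <a> start and end tags."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            # get_starttag_text() returns the start tag as it appeared in the source
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/>
        if tag != 'a':
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag != 'a':
            self.parts.append('</%s>' % tag)

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append('&%s;' % name)

    def handle_charref(self, name):
        self.parts.append('&#%s;' % name)

def strip_links(html):
    """Return html with anchor tags removed and their text kept."""
    parser = AnchorStripper()
    parser.feed(html)
    parser.close()
    return ''.join(parser.parts)

print(strip_links('<p> Hello <a href="http://example.com">link text1</a> and <a href="http://example.com">link text2</a> ! </p>'))
# <p> Hello link text1 and link text2 ! </p>
```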
Upvotes: 0
Reputation: 12092
Looks like a perfect case for BeautifulSoup's unwrap() method:
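from bs4 import BeautifulSoup

data = '''<p> Hello <a href="http://example.com">link text1</a> and <a href="http://example.com">link text2</a> ! </p>'''
# Name the parser explicitly to avoid bs4's "no parser specified" warning
soup = BeautifulSoup(data, 'html.parser')
p_tag = soup.find('p')
# unwrap() replaces each <a> tag with its contents
for a_tag in p_tag.find_all('a'):
    a_tag.unwrap()
print(p_tag)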
This gives:
<p> Hello link text1 and link text2 ! </p>
Upvotes: 3