Reputation: 159
I have some HTML code (for display in a browser) in a string which contains any number of svg images such as:
<table>
<tr>
<td><img src="http://localhost/images/Store.Tools.svg"></td>
<td><img src="http://localhost/images/Store.Diapers.svg"></td>
</tr>
</table>
I would like to find all HTML links and replace them to the following (in order to attach them as email):
<table>
<tr>
<td><cid:image1></td><td><cid:image2></td>
</tr>
</table>
SVG filenames can have any arbitrary number of dots, chars, and numbers.
What's the best way to do this in python?
Upvotes: 2
Views: 196
Reputation: 474141
I'd use an HTML Parser to find all img
tags and replace them.
Example using BeautifulSoup
and it's replace_with()
:
from bs4 import BeautifulSoup
data = """
<table><tr>
<td><img src="http://localhost/images/Store.Tools.svg"></td>
<td><img src="http://localhost/images/Store.Diapers.svg"></td>
</tr></table>
"""
soup = BeautifulSoup(data, 'html.parser')
for index, image in enumerate(soup.find_all('img'), start=1):
tag = soup.new_tag('img', src='cid:image{}'.format(index))
image.replace_with(tag)
print soup.prettify()
Prints:
<table>
<tr>
<td>
<img src="cid:image1"/>
</td>
<td>
<img src="cid:image2"/>
</td>
</tr>
</table>
Upvotes: 3