Alex Zaitsev

Reputation: 701

Do not render images from unlisted hosts in Python-Markdown

I use Python-Markdown to render user-generated content. I'd like to replace images from external sources with plain links.

I have a list of approved storage hosts:

storages = ['foo.com', 'bar.net']

and I need to replace

![](http://external.com/image.png)

with something like:

[http://external.com/image.png](http://external.com/image.png)

if the host is not in storages.

I tried editing the Markdown text before saving it to the database, but that is not a good solution: a user may want to edit their content later and would discover that it had been modified. So I want to do the replacement at render time.

Upvotes: 0

Views: 217

Answers (1)

Waylan

Reputation: 42497

One solution to your question is demonstrated in this tutorial:

from markdown.treeprocessors import Treeprocessor
from markdown.extensions import Extension
from urllib.parse import urlparse


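class InlineImageProcessor(Treeprocessor):
    def __init__(self, md, hosts):
        super().__init__(md)
        self.hosts = hosts

    def is_unknown_host(self, url):
        # Relative URLs have no netloc and are always treated as local.
        url = urlparse(url)
        return bool(url.netloc) and url.netloc not in self.hosts

    def run(self, root):
        for element in root.iter('img'):
            # Copy the attributes: element.clear() below discards them.
            attrib = dict(element.attrib)
            if self.is_unknown_host(attrib['src']):
                tail = element.tail
                element.clear()
                element.tag = 'a'
                src = attrib.pop('src')
                element.set('href', src)
                # Use the URL as the link text when the alt text is empty.
                element.text = attrib.pop('alt', '') or src
                element.tail = tail
                # Carry over any remaining attributes (e.g. title).
                for k, v in attrib.items():
                    element.set(k, v)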


class ImageExtension(Extension):
    def __init__(self, **kwargs):
        # Each config entry is [default value, description].
        self.config = {'hosts': [[], 'List of approved hosts']}
        super().__init__(**kwargs)

    def extendMarkdown(self, md):
        md.treeprocessors.register(
            InlineImageProcessor(md, hosts=self.getConfig('hosts')),
            'inlineimageprocessor',
            15
        )
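
To see what the element rewrite in run() does in isolation, the same steps can be tried on a bare ElementTree node with no Markdown involved. This is only a sketch under a few assumptions: img_to_link is a hypothetical helper name (not part of Python-Markdown), the attributes are copied up front so they are still available after clear(), and the fallback to the URL when the alt text is empty is an addition meant to match the output asked for in the question.

```python
import xml.etree.ElementTree as ET


def img_to_link(element):
    """Turn an <img> element into an <a> element in place (hypothetical helper)."""
    attrib = dict(element.attrib)  # copy first: clear() removes the element's attributes
    tail = element.tail            # clear() also resets text and tail
    element.clear()
    element.tag = 'a'
    src = attrib.pop('src')
    element.set('href', src)
    # Assumption: fall back to the URL when the alt text is missing or empty.
    element.text = attrib.pop('alt', '') or src
    element.tail = tail
    for k, v in attrib.items():    # carry over remaining attributes, e.g. title
        element.set(k, v)


p = ET.fromstring(
    '<p><img alt="" src="http://external.com/image.png" title="t"/> after</p>'
)
img_to_link(p.find('img'))
print(ET.tostring(p, encoding='unicode'))
```

Saving the tail before clear() is what keeps the text that follows the image (" after" here) in place once the element has become a link.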

Testing it out:

>>> import markdown
>>> from image_extension import ImageExtension
>>> input = """
... ![a local image](/path/to/image.jpg)
... 
... ![a remote image](http://example.com/image.jpg)
... 
... ![an excluded remote image](http://exclude.com/image.jpg)
... """
>>> print(markdown.markdown(input, extensions=[ImageExtension(hosts=['example.com'])]))
<p><img alt="a local image" src="/path/to/image.jpg"/></p>
<p><img alt="a remote image" src="http://example.com/image.jpg"/></p>
<p><a href="http://exclude.com/image.jpg">an excluded remote image</a></p>
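
One thing to keep in mind about the netloc comparison: urlparse keeps the port (and any credentials) in netloc, so http://example.com:8080/image.jpg would not match an approved host of example.com. If that matters for your content, a stricter check could compare hostname instead. A minimal sketch, assuming a standalone function that mirrors the is_unknown_host method above (the function form and the sample URLs are illustrative only):

```python
from urllib.parse import urlparse


def is_unknown_host(url, hosts):
    """Return True for absolute URLs whose host is not in `hosts`."""
    parsed = urlparse(url)
    if not parsed.netloc:
        return False  # relative URL: treated as local, like the method above
    # hostname drops the port and any userinfo, and is lowercased
    return parsed.hostname not in hosts


storages = ['foo.com', 'bar.net']
print(is_unknown_host('/path/to/image.jpg', storages))             # False
print(is_unknown_host('http://FOO.com:8080/image.png', storages))  # False
print(is_unknown_host('http://external.com/image.png', storages))  # True
```

Whether ports and letter case should be ignored is a policy decision. The original netloc check is the stricter reading of "listed host", and this variant is the more forgiving one.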

Full disclosure: I am the lead developer of Python-Markdown. We needed another tutorial which demonstrated some additional features of the extension API. I saw this question and thought it would make a good candidate. Therefore, I wrote up the tutorial, which steps through the development process to end up with the result above. Thank you for the inspiration.

Upvotes: 2
