Find hyperlinks in reStructuredText using docutils.parsers.rst

Question

For an internal consistency-checking tool, I am trying to assemble a list of all hyperlinks (external references, images, etc.) in an rst file using Python 3.

I managed to parse the rst and walk the tree using the code below:

        import docutils.frontend
        import docutils.nodes
        import docutils.parsers.rst
        import docutils.utils

        # Build default parser settings and an empty document to parse into.
        parser = docutils.parsers.rst.Parser()
        components = (docutils.parsers.rst.Parser,)
        settings = docutils.frontend.OptionParser(components=components).get_default_values()
        document = docutils.utils.new_document('', settings=settings)
        parser.parse(f, document)  # f is the rst source text as a str

        class MyVisitor(docutils.nodes.NodeVisitor):
            def visit_reference(self, node: docutils.nodes.reference) -> None:
                """Called for "reference" nodes."""
                print("reference", node)

            def unknown_visit(self, node: docutils.nodes.Node) -> None:
                """Called for all other node types."""
                print("unknown_visit", node)

        # Walk the parsed tree; each node is dispatched to the matching visit_* method.
        visitor = MyVisitor(document)
        document.walk(visitor)

However, I am now completely stuck on how to find references to images and external links (URLs) within the result.

Does anyone know how to retrieve these external links programmatically from the parsed document?
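
One possible direction, offered as a minimal sketch rather than a definitive answer: in the docutils node tree, external links show up as "reference" nodes carrying a "refuri" attribute, and images show up as "image" nodes carrying a "uri" attribute. The sketch below assumes it is acceptable to use docutils.core.publish_doctree instead of calling the parser directly, so that the standard transforms resolve named references such as Python_ to their target URL. The function name collect_links and the class name LinkCollector are hypothetical and not part of docutils.

```python
import docutils.core
import docutils.nodes


def collect_links(rst_source: str) -> dict:
    """Return external link targets and image URIs found in rst_source."""
    # publish_doctree parses the text and applies the standard transforms,
    # so named references (`text`_ or name_) get their "refuri" filled in.
    doctree = docutils.core.publish_doctree(
        rst_source,
        settings_overrides={"report_level": 5},  # do not print parser warnings
    )

    links, images = [], []

    # SparseNodeVisitor ignores every node type without a visit_* method,
    # so no unknown_visit fallback is needed here.
    class LinkCollector(docutils.nodes.SparseNodeVisitor):
        def visit_reference(self, node):
            # Internal references (e.g. to section titles) have no "refuri".
            uri = node.get("refuri")
            if uri is not None:
                links.append(uri)

        def visit_image(self, node):
            images.append(node["uri"])

    doctree.walk(LinkCollector(doctree))
    return {"links": links, "images": images}
```

Calling collect_links(text) on the rst source returns a dict with a "links" list and an "images" list. An image that uses the :target: option is wrapped in a reference node, so its target would appear under "links" while the picture itself appears under "images".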

