Parsing and updating markdown file with Python

Question

I'm creating a script that will traverse a markdown file and update any image tags from

![Daffy Duck](http://www.nonstick.com/wp-content/uploads/2015/10/Daffy.gif)

to

![Daffy Duck](http://www.nonstick.com/wp-content/uploads/2015/10/Daffy.gif?alt-text="Daffy Duck")

I'm new to Python, so I'm unsure about the syntax and my approach. My current thinking is to create a new empty string and traverse the original markdown line by line; if an image tag is detected, splice the alt text into the correct location, then add each line to the new markdown string. The code I have so far looks like this:

import markdown
from markdown.treeprocessors import Treeprocessor
from markdown.extensions import Extension


originalMarkdown = '''
## New Article
Lorem ipsum dolor sit amet, consectetur adipiscing elit. In pretium nunc ligula. Quisque bibendum vel lectus sed pulvinar. Phasellus id magna ac arcu iaculis facilisis. Curabitur tincidunt sed ipsum vel lacinia. Nulla et semper urna. Quisque ultrices hendrerit magna nec tempor. 

![Daffy Duck](http://www.nonstick.com/wp-content/uploads/2015/10/Daffy.gif)
Quisque accumsan sem mi. Nunc orci justo, laoreet vel metus nec, interdum euismod ipsum. 
![Bugs Bunny](http://www.nationalnannies.com/wp-content/uploads/2012/03/bugsbunny.png)
 Suspendisse augue odio, pharetra ac erat eget, volutpat ornare velit. Sed ac luctus quam. Sed id mauris erat. Duis lacinia faucibus metus, nec vehicula metus consectetur eu.
'''

updatedMarkdown = ""

# First create the treeprocessor
class AltTextExtractor(Treeprocessor):
    def run(self, doc):
        "Find all alt_text and append to markdown.alt_text."
        self.markdown.alt_text = []
        for image in doc.findall('.//img'):
            self.markdown.alt_text.append(image.get('alt'))

# Then traverse the markdown file and concatenate the alt text to the end of any image tags
class ImageTagUpdater(Treeprocessor):
    def run(self, doc):
        # Set a counter
        count = 0
        # Go through markdown file line by line
        for line in doc:
            # if line is an image tag
            if line > ('.//img'):
                # grab the array of the alt text
                img_ext = ImgExtractor(md)
                # find the second to last character of the line
                location = len(line) - 1
                # insert the alt text
                line += line[location] + '?' + '"' + img_ext[count] + '"'
            # add line to new markdown file
            updatedMarkdownadd.add(line)

The above code is pseudocode. I can successfully extract the strings I need from the original file, but I can't concatenate those strings to their respective image tags and update the original file.
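
For reference, the splice step described above could be sketched with a regular-expression substitution on the raw markdown text, rather than on the parsed element tree. This is only a minimal sketch under some assumptions: every image tag has the plain form `![alt](url)`, with no title and no spaces or parentheses inside the URL. The names `IMAGE_TAG` and `add_alt_text_to_urls` are made up for illustration and are not part of the markdown library.

```python
import re

# Matches inline images of the form ![alt](url).
# Assumption: no title, and no spaces or parentheses inside the URL.
IMAGE_TAG = re.compile(r'!\[([^\]]*)\]\(([^)\s]+)\)')


def add_alt_text_to_urls(markdown_text):
    """Return markdown_text with ?alt-text="<alt>" appended to each image URL."""
    def rewrite(match):
        alt, url = match.group(1), match.group(2)
        # Rebuild the tag, reusing the alt text as the query-string value.
        return '![{0}]({1}?alt-text="{0}")'.format(alt, url)

    # sub() calls rewrite() once per image tag and leaves all other text alone.
    return IMAGE_TAG.sub(rewrite, markdown_text)


original = '![Daffy Duck](http://www.nonstick.com/wp-content/uploads/2015/10/Daffy.gif)'
updated = add_alt_text_to_urls(original)
print(updated)
```

Because the substitution runs over the whole string, it could be applied to `originalMarkdown` in one call, and the result written back to the file if the original should be replaced.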
