Reputation: 225
I'm creating a script that will traverse a markdown file and update the any image tags from

to

I'm new to Python, so I'm unsure about syntax and my approach, but my current thinking is to create an new empty string, traverse the original markdown line by line, if an image tag is detected splice the alt text to the correct location and add the lines to the new markdown string. The code I have so far looks like:
import markdown
from markdown.treeprocessors import Treeprocessor
from markdown.extensions import Extension
originalMarkdown = '''
## New Article
Lorem ipsum dolor sit amet, consectetur adipiscing elit. In pretium nunc ligula. Quisque bibendum vel lectus sed pulvinar. Phasellus id magna ac arcu iaculis facilisis. Curabitur tincidunt sed ipsum vel lacinia. Nulla et semper urna. Quisque ultrices hendrerit magna nec tempor.

Quisque accumsan sem mi. Nunc orci justo, laoreet vel metus nec, interdum euismod ipsum.

Suspendisse augue odio, pharetra ac erat eget, volutpat ornare velit. Sed ac luctus quam. Sed id mauris erat. Duis lacinia faucibus metus, nec vehicula metus consectetur eu.
'''
updatedMarkdown = ""
# First create the treeprocessor
class AltTextExtractor(Treeprocessor):
def run(self, doc):
"Find all alt_text and append to markdown.alt_text. "
self.markdown.alt_text = []
for image in doc.findall('.//img'):
self.markdown.alt_text.append(image.get('alt'))
# Then traverse the markdown file and concatenate the alt text to the end of any image tags
class ImageTagUpdater(Treeprocessor):
def run(self, doc):
# Set a counter
count = 0
# Go through markdown file line by line
for line in doc:
# if line is an image tag
if line > ('.//img'):
# grab the array of the alt text
img_ext = ImgExtractor(md)
# find the second to last character of the line
location = len(line) - 1
# insert the alt text
line += line[location] + '?' + '"' + img_ext[count] + '"'
# add line to new markdown file
updatedMarkdownadd.add(line)
The above code is pseudo code. I'm able to successfully extract the strings I need from the original file but I'm unable to concatenate those strings to their respective image tags and update the original file.
Upvotes: 0
Views: 7027
Reputation: 99
Provided your files aren't huge, it might be easier to overwrite the file, rather than try to wedge little bits in here or there.
orig = ''
new = ''
with open(filename, 'r') as f:
text = f.readlines()
new_text = "\n".join([line if line != orig else new for line in text])
with open(filename, 'w') as f:
f.write(new_text)
You could also use regex re.sub, but I suppose its a matter of preference.
Upvotes: 2