Reputation: 725
I want to perform text replacements in a shape's text. I am using code similar to the snippet below:
# define key/value
SRKeys, SRVals = ['x','y','z'], [1,2,3]
# define text
text = shape.text
# iterate through values and perform subs
for i in range(len(SRKeys)):
    # replace text
    text = text.replace(SRKeys[i], str(SRVals[i]))
# write text subs to comment box
shape.text = text
However, if the initial shape.text has formatted characters (bold, for example), the formatting is lost once the text has been read and written back. Is there a solution for this?
The only thing I could think of is to iterate over the characters, check their formatting, and then add these formats back before writing to shape.text.
Upvotes: 3
Views: 1502
Reputation: 725
Here is an adapted version of the code I'm using (inspired by @scanny's answer). It replaces text in every shape on a slide that has a text frame.
from pptx import Presentation
prs = Presentation('../../test.pptx')
slide = prs.slides[1]
# iterate through all shapes on slide
for shape in slide.shapes:
    if not shape.has_text_frame:
        continue
    # iterate through paragraphs in shape
    for p in shape.text_frame.paragraphs:
        # store formats and their runs by index (not dict because of duplicate runs)
        formats, newRuns = [], []
        # iterate through runs
        for r in p.runs:
            # get text
            text = r.text
            # replace text
            text = text.replace('s','xyz')
            # store run
            newRuns.append(text)
            # store format
            formats.append({'size':r.font.size,
                            'bold':r.font.bold,
                            'underline':r.font.underline,
                            'italic':r.font.italic})
        # clear paragraph
        p.clear()
        # iterate through new runs and formats and write to paragraph
        for i in range(len(newRuns)):
            # add run with text
            run = p.add_run()
            run.text = newRuns[i]
            # format run
            run.font.bold = formats[i]['bold']
            run.font.italic = formats[i]['italic']
            run.font.size = formats[i]['size']
            run.font.underline = formats[i]['underline']
prs.save('../../test.pptx')
Upvotes: 2
Reputation: 28893
@usr2564301 is on the right track. Character formatting (a.k.a. "font") is specified at the run level. That is what a run is: a "run" (sequence) of characters all sharing the same character formatting.
When you assign to shape.text, you replace all the runs that used to be there with a single new run having default formatting. If you want to preserve formatting, you need to preserve whatever runs are not directly involved in the text replacement.
This is not a trivial problem because there is no guarantee runs break on word boundaries. Try printing out the runs for a few paragraphs and I think you'll see what I mean.
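To make the "print out the runs" suggestion concrete, here is a small sketch. The helper name describe_runs is hypothetical; it only reads attributes that python-pptx shapes, paragraphs, and runs expose (text_frame.paragraphs, runs, text, font.bold, font.italic), so it takes any shape that has a text frame.

```python
def describe_runs(shape):
    """Return one (paragraph_index, run_index, text, bold, italic) tuple per run.

    `shape` is expected to be a python-pptx shape whose has_text_frame is True.
    describe_runs is a hypothetical helper, not part of python-pptx.
    """
    rows = []
    for p_idx, paragraph in enumerate(shape.text_frame.paragraphs):
        for r_idx, run in enumerate(paragraph.runs):
            rows.append((p_idx, r_idx, run.text, run.font.bold, run.font.italic))
    return rows


# Possible usage against a real file (the path is an assumption):
#
#     from pptx import Presentation
#     prs = Presentation('test.pptx')
#     for shape in prs.slides[0].shapes:
#         if shape.has_text_frame:
#             for row in describe_runs(shape):
#                 print(row)
```

On a slide where only part of a word is bold, the output would typically show that word split across two runs, which is why a plain per-run str.replace can miss matches.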
In rough pseudocode, I think this is the approach you would need to take:
This preserves any runs that do not involve the search string and preserves the formatting of the "matched" word in the "replaced" word.
This requires a few operations that are not directly supported by the current API. For those you'd need to use lower-level lxml calls to directly manipulate the XML, although you could get hold of all the existing elements you need from python-pptx objects without ever having to parse the XML yourself.
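As an illustration of the run-preserving idea in this answer, here is a minimal sketch that uses only the writable text attribute of each run. It is not the answer's own pseudocode: the helper name replace_in_runs is hypothetical, and instead of deleting the runs that a match spills into (the step that would need lxml), it leaves them in place with the matched characters removed, so some runs may end up empty.

```python
def replace_in_runs(runs, search, replacement):
    """Replace each occurrence of `search` across `runs`, keeping run formatting.

    `runs` is any sequence of objects with a writable `.text` attribute,
    for example `paragraph.runs` in python-pptx. The replacement text goes
    into the run where the match starts, so it inherits that run's font.
    Returns the number of replacements made.
    """
    if not search:
        return 0
    count = 0
    start_at = 0
    while True:
        full_text = "".join(run.text for run in runs)
        start = full_text.find(search, start_at)
        if start < 0:
            return count
        end = start + len(search)
        offset = 0      # position of the current run within full_text
        first = True    # True until the run holding the match start is seen
        for run in runs:
            run_start, run_end = offset, offset + len(run.text)
            offset = run_end
            if run_end <= start or run_start >= end:
                continue  # run is not part of this match, leave it untouched
            # bounds of the matched characters inside this run
            lo = max(start, run_start) - run_start
            hi = min(end, run_end) - run_start
            middle = replacement if first else ""
            run.text = run.text[:lo] + middle + run.text[hi:]
            first = False
        count += 1
        # continue after the inserted text so a replacement that contains
        # the search string is not matched again
        start_at = start + len(replacement)
```

With python-pptx this would be called once per paragraph, for example replace_in_runs(p.runs, 'x', '1') inside the loop over shape.text_frame.paragraphs. Matches that span two paragraphs are not handled.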
Upvotes: 2