Reputation: 11
I am having trouble finding certain words from a dictionary in my document and bolding only those words. Right now this code bolds the entire paragraph.
# Convert to .docx
with open(file, 'r', encoding='utf-8') as openfile:
    line = openfile.read()
    doc.add_paragraph(line)
# Bold speaker names
for i in dictionary:
    for p in doc.paragraphs:
        if p.text.find(i) >= 0:
            p.text = p.text.replace(i, dictionary[i])
            p.style.font.bold = True  # bolds entire thing, not just dictionary name
# Save in current repository
doc.save(fileName + ".docx")
os.system(fileName + ".docx")
Upvotes: 0
Views: 3887
Reputation: 21
You are applying bold to the entire paragraph. `p.style.font.bold = True` sets bold on the paragraph's style, so it affects the whole paragraph (and any other paragraph that uses the same style), not just the name. Your paragraph also consists of only one run, because you used add_paragraph to add "line". You need to break "line" down into substrings and add the substrings and the speakers' names to the paragraph as separate runs, in the order you want, using add_run. When you add the run containing a speaker's name, bold that run at the time you add it. See: [bolding python-docx documentation][1], [runs documentation][2], and [add_run documentation][3].
As far as I can find, python-docx has no method for deleting a paragraph or a run, so the approach below leaves behind an empty paragraph. You should clean out the style on that paragraph as well. Your code treated all the runs in the paragraph as a single text object, so my solution does the same, which means any formatting on the runs in the original paragraph will be lost. A more complete solution would apply the style of the old paragraph to the new paragraph and treat each run of the old paragraph individually, but you didn't ask that question.
from docx import Document
# create document
doc = Document()
# add a paragraph comprised of a single run
paraText = '''Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Vestibulum
dictum a mauris quis posuere. Proin ac volutpat nulla, sit amet lacinia
sapien.
Aenean et lectus interdum, convallis nisi ultrices, dignissim dui. Donec in
pretium mi.
Vestibulum turpis lorem, convallis et nisl id, aliquam laoreet sem.
Suspendisse erat
justo, faucibus ut eleifend ac, dapibus a nisl. Donec nibh velit, lacinia
at vestibulum
at, ultrices volutpat lorem. Sed eu diam odio. Ut feugiat, turpis eget
tempus malesuada,
libero neque venenatis mi, id vestibulum lectus felis sit amet quam.'''
paragraph = doc.add_paragraph(paraText, style=None)
# Find your string
string2Bold = 'volutpat nulla'
lenString = len(string2Bold)
for oldPara in doc.paragraphs:
    if oldPara.text.find(string2Bold) >= 0:
        # insert, not add, empty paragraph
        newPara = oldPara.insert_paragraph_before(text=None, style=None)
        # get all the text from the old paragraph
        paraText = oldPara.text
        # determine sections of text before and after text to bold
        index = paraText.index(string2Bold)
        before = paraText[0:index]
        after = paraText[index + lenString:]
        # reconstruct the paragraph as three runs
        newPara.add_run(before)
        # add your bolded text
        run = newPara.add_run(string2Bold)
        run.bold = True
        newPara.add_run(after)
        # clear out old paragraph, there is no delete or remove that I can find
        # in python-docx, leaves behind an empty paragraph with a style attached
        oldPara.clear()
doc.save('boldTest.docx')
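The code above bolds only the first occurrence of one string. If you need to bold every occurrence of several names, as in the question, one possible variation is sketched below. The helper names `split_on_terms` and `bold_terms` are my own, not part of python-docx. The sketch relies only on `paragraph.text`, `paragraph.clear()` and `paragraph.add_run()`, and it rebuilds the runs in the same paragraph rather than inserting a new one, so it should not leave an empty paragraph behind. Like the code above, it discards any formatting on the original runs.

```python
import re


def split_on_terms(text, terms):
    """Split text into (segment, is_term) pairs, in document order."""
    # Try longer terms first so that "Ann Lee" wins over "Ann".
    ordered = sorted((t for t in terms if t), key=len, reverse=True)
    if not ordered:
        return [(text, False)] if text else []
    pattern = re.compile("|".join(re.escape(t) for t in ordered))
    pieces = []
    pos = 0
    for match in pattern.finditer(text):
        if match.start() > pos:
            # plain text between the previous match and this one
            pieces.append((text[pos:match.start()], False))
        pieces.append((match.group(), True))
        pos = match.end()
    if pos < len(text):
        pieces.append((text[pos:], False))
    return pieces


def bold_terms(paragraph, terms):
    """Rebuild the paragraph so each occurrence of a term is its own bold run."""
    pieces = split_on_terms(paragraph.text, terms)
    if not any(is_term for _, is_term in pieces):
        return  # nothing to bold, leave the existing runs untouched
    paragraph.clear()  # removes the runs, keeps the paragraph-level style
    for segment, is_term in pieces:
        run = paragraph.add_run(segment)
        if is_term:
            run.bold = True
```

With the names from the question, this might be called as `for p in doc.paragraphs: bold_terms(p, dictionary.values())` once the text replacements have been made.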
Upvotes: 2