Reputation: 93
I am using Python 3.4 with the python-docx library to work on .docx files. I have been able to extract text from the document, but my objective is to extract only the text set in a certain font (and modify it).
I have searched the library documentation for the past two days with no result.
Does anybody here have experience with this library? If so, could you point me in the right direction?
Upvotes: 4
Views: 2412
Reputation: 28903
At present, python-docx can only apply a font typeface through a style. You can detect runs that have a particular style like this:
from docx import Document

document = Document('having-fonts.docx')
for paragraph in document.paragraphs:
    for run in paragraph.runs:
        # style_I_want is the character style to match; define it yourself
        if run.style == style_I_want:
            print(run.text)
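As a hedged variant of the loop above, the sketch below matches runs by the name of their character style. It assumes a python-docx release in which run.style returns a style object with a name attribute. The helper names runs_with_style and print_styled_runs are made up for illustration, and only body paragraphs are scanned (not tables, headers, or footers).

```python
def runs_with_style(document, style_name):
    """Yield runs whose character style is named `style_name`."""
    for paragraph in document.paragraphs:
        for run in paragraph.runs:
            style = run.style
            # Assumption: run.style is a style object exposing `.name`
            if style is not None and style.name == style_name:
                yield run


def print_styled_runs(path, style_name):
    """Print the text of every matching run in the .docx file at `path`."""
    from docx import Document  # requires the python-docx package

    document = Document(path)
    for run in runs_with_style(document, style_name):
        print(run.text)
```

For example, print_styled_runs('having-fonts.docx', 'Emphasis') would list the runs using a character style called Emphasis, where that style name is only a placeholder.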
If the special fonts are applied using a paragraph style you could use this:
from docx import Document

document = Document('having-fonts.docx')
for paragraph in document.paragraphs:
    # style_I_want is the paragraph style to match; define it yourself
    if paragraph.style == style_I_want:
        print(paragraph.text)
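If the typeface was applied directly to the text rather than through a style, later python-docx releases expose it as run.font.name, so a sketch along the following lines may work. It assumes such a release is installed. The helper names runs_with_font and replace_font are invented for illustration, and only body paragraphs are scanned. Runs that inherit their typeface from a style report None and are not matched.

```python
def runs_with_font(document, font_name):
    """Yield runs whose typeface is set directly to `font_name`."""
    for paragraph in document.paragraphs:
        for run in paragraph.runs:
            # run.font.name is None when the typeface is inherited from a style
            if run.font.name == font_name:
                yield run


def replace_font(path, old_font, new_font, out_path):
    """Switch directly formatted runs from old_font to new_font; save a copy."""
    from docx import Document  # requires the python-docx package

    document = Document(path)
    changed = 0
    for run in runs_with_font(document, old_font):
        run.font.name = new_font
        changed += 1
    document.save(out_path)
    return changed  # number of runs that were modified
```

Saving to a separate out_path leaves the original file untouched, which makes it easier to compare the result before overwriting anything.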
If you can say more about the particulars I may be able to be more specific.
Upvotes: 2