Sulav Malla

Reputation: 93

Extract text of certain font face from a docx file

I am using Python 3.4 with the python-docx library to work on .docx files. I have been able to extract text from the document, but my objective is to extract only the text set in a certain font (and modify it).

I have been searching for this in the library documentation for the past two days with no result.

Does anybody here have experience with this library? If so, could you point me in the right direction?

Upvotes: 4

Views: 2412

Answers (1)

scanny

Reputation: 28903

At present, python-docx only has the ability to apply a font typeface using a style. You can detect runs having a particular style like this:

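from docx import Document

style_I_want = 'Emphasis'  # placeholder: name of the character style to match
document = Document('having-fonts.docx')
for paragraph in document.paragraphs:
    for run in paragraph.runs:
        # assumes run.style is a style object; if your release returns a plain string, compare run.style itself
        if run.style.name == style_I_want:
            print(run.text)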

If the special fonts are applied using a paragraph style instead, you could use this:

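from docx import Document

style_I_want = 'Quote'  # placeholder: name of the paragraph style to match
document = Document('having-fonts.docx')
for paragraph in document.paragraphs:
    # assumes paragraph.style is a style object; if it is a plain string, compare it directly
    if paragraph.style.name == style_I_want:
        print(paragraph.text)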
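
If the typeface was applied directly to the text rather than through a style, a rough sketch along the following lines may help. It assumes a python-docx release in which each run exposes a `font` property whose `name` is the directly applied typeface (and `None` when the typeface is inherited from a style). The helper names `iter_runs_with_font` and `uppercase_font_text`, and the upper-casing itself, are made up for illustration; substitute whatever modification you need.

```python
def iter_runs_with_font(document, font_name):
    """Yield every run whose directly applied typeface equals font_name."""
    for paragraph in document.paragraphs:
        for run in paragraph.runs:
            # run.font.name is None when the typeface comes from a style,
            # so such runs are skipped by this comparison.
            if run.font.name == font_name:
                yield run


def uppercase_font_text(path, font_name, out_path):
    """Hypothetical example edit: upper-case the matching runs and save a copy."""
    # Imported here so the generator above has no hard dependency on python-docx.
    from docx import Document

    document = Document(path)
    for run in iter_runs_with_font(document, font_name):
        run.text = run.text.upper()
    document.save(out_path)
```

For example, `uppercase_font_text('having-fonts.docx', 'Arial', 'modified.docx')` would rewrite only the runs set directly in Arial. Text that gets its font from a style will not match here; the style-based loops above are the way to find that.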

If you can say more about the particulars I may be able to be more specific.

Upvotes: 2
