Reputation: 41
I have tried using the python-docx module. So far, I have been able to extract specific paragraphs, as well as the whole text, from the Word file.
pip install --pre python-docx #to install python-docx
from docx import Document
document = Document('file.docx')
document.paragraphs # to extract paragraphs
document.paragraphs[2].text # gives the text
for par in document.paragraphs:  # to extract the whole text
    print(par.text)
# I tried the below code to find some specific term
for i in range(0, 50, 1):
    if document.paragraphs[i].text == 'Some-word':
        print(document.paragraphs[i].text)
I expect to find a specific word and have it shown highlighted in the Word file.
Upvotes: 1
Views: 5129
Reputation: 5824
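This searches through all paragraphs and prints each one that contains the term. The == comparison only matches when the whole paragraph is exactly 'Some-word', whereas the in operator matches the term anywhere inside the paragraph text:
for par in document.paragraphs:  # check every paragraph
    if 'Some-word' in par.text:
        print(par.text)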
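The question also asks for the word to appear highlighted, which printing alone does not do. Below is a minimal sketch of one possible approach, not a definitive implementation. It assumes a python-docx version that supports font.highlight_color and WD_COLOR_INDEX. The helper names split_on_term and highlight_term, and the file names in the usage note, are hypothetical. The sketch rebuilds each matching paragraph from plain text, so any existing bold, italic, or other run-level formatting in that paragraph is lost, and only body paragraphs are covered (not tables, headers, or footers).

```python
from typing import List, Tuple


def split_on_term(text: str, term: str) -> List[Tuple[str, bool]]:
    """Split text into (segment, is_match) pieces around each occurrence of term."""
    if not term:
        return [(text, False)] if text else []
    pieces = []
    start = 0
    while True:
        idx = text.find(term, start)
        if idx == -1:
            break
        if idx > start:
            pieces.append((text[start:idx], False))  # text before the match
        pieces.append((term, True))                  # the match itself
        start = idx + len(term)
    if start < len(text):
        pieces.append((text[start:], False))         # trailing text
    return pieces


def highlight_term(src_path: str, dst_path: str, term: str) -> int:
    """Highlight each occurrence of term in body paragraphs; return the count."""
    # Imported here so split_on_term stays usable without python-docx installed.
    from docx import Document
    from docx.enum.text import WD_COLOR_INDEX

    document = Document(src_path)
    count = 0
    for par in document.paragraphs:
        if term not in par.text:
            continue
        pieces = split_on_term(par.text, term)
        par.clear()  # removes existing runs, including their character formatting
        for segment, is_match in pieces:
            run = par.add_run(segment)
            if is_match:
                run.font.highlight_color = WD_COLOR_INDEX.YELLOW
                count += 1
    document.save(dst_path)
    return count
```

For example, highlight_term('file.docx', 'file_highlighted.docx', 'Some-word') would write a copy of the document with each occurrence highlighted in yellow and leave the original file untouched. The match is case-sensitive, like the in check above.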
Upvotes: 1