Jonathan Lam

Reputation: 1328

How to convert a Word docx to an image using Python

I have a Word document that I need to convert to an image, essentially by opening the document and taking a screenshot of its content. Is there a library that can do this?

Upvotes: 1

Views: 3480

Answers (1)

Mike67

Reputation: 11342

This code does most of the work: it opens Word, maximizes the window, loads the document, takes a screenshot, and then closes Word. It needs Windows with Word installed, plus the pywin32 and pyautogui packages. Because the screenshot captures the entire screen, you will probably need additional image processing to crop it to the region you want.

import win32com.client as win32
import pyautogui
import win32gui
import time

docfile = 'D:/test.docx'
shotfile = 'D:/shot.png'

def windowEnumerationHandler(hwnd, top_windows):
    # EnumWindows callback: record each top-level window's handle and title
    top_windows.append((hwnd, win32gui.GetWindowText(hwnd)))

word = win32.gencache.EnsureDispatch('Word.Application')
word.Visible = True
word.WindowState = 1  # maximize

top_windows = []
win32gui.EnumWindows(windowEnumerationHandler, top_windows)

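for hwnd, title in top_windows:  # all open top-level windows
    if "word" in title.lower():  # find Word (assumes only one match)
        try:
            win32gui.ShowWindow(hwnd, 5)  # 5 = SW_SHOW
            win32gui.SetForegroundWindow(hwnd)  # bring to front
            break
        except Exception:  # some windows refuse focus; try the next match
            pass
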
doc = word.Documents.Open(docfile)  # open the file itself (Add would create a new document based on it)

time.sleep(2)  # wait for doc to load

myScreenshot = pyautogui.screenshot() # take screenshot
myScreenshot.save(shotfile) # save screenshot

# close the document without saving, then quit Word
doc.Close(False)
word.Quit()
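
For the cropping step mentioned above, one possible approach is to crop the saved screenshot with Pillow, the library behind the image object that pyautogui.screenshot() returns. This is only a sketch: the helper names clamp_box and crop_screenshot are hypothetical, and the crop box has to be found by inspecting your own screenshot, since it depends on screen resolution and Word's layout.

```python
def clamp_box(box, size):
    """Clip a (left, top, right, bottom) box to an image of (width, height)."""
    left, top, right, bottom = box
    width, height = size
    clipped = (max(0, left), max(0, top), min(width, right), min(height, bottom))
    if clipped[0] >= clipped[2] or clipped[1] >= clipped[3]:
        raise ValueError("crop box lies outside the image")
    return clipped


def crop_screenshot(src_path, dst_path, box):
    """Crop the image at src_path to box and save the result to dst_path."""
    from PIL import Image  # Pillow, imported here so clamp_box works without it

    with Image.open(src_path) as img:
        img.crop(clamp_box(box, img.size)).save(dst_path)


# Example call; the output path and pixel coordinates are made up for illustration:
# crop_screenshot('D:/shot.png', 'D:/page.png', (400, 200, 1500, 1000))
```

If the region is known before the capture, pyautogui.screenshot(region=(left, top, width, height)) can grab just that area and skip the separate crop.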

Upvotes: 2
