Cryssie

Reputation: 3185

Problems Accessing MS Word 2010 with Python

I am using Python with Eclipse and need to access an MS Word file from Python. I have seen some examples of this and have already installed pywin32, but when I try the examples I get errors.

import win32com.client as win32

word = win32.Dispatch("Word.Application")
word.Visible = 0
word.Documents.Open("myfile.docx")
doc = word.ActiveDocument
print doc.Content.Text
word.Quit()

This is the error I am getting. It would be great if anyone can tell me what I did wrong here.

Traceback (most recent call last):
  File "C:\Users\dino\Desktop\Python27\Test\src\AccessWordDoc.py", line 10, in <module>
    word = win32.Dispatch("Word.Application")
  File "C:\Python27\lib\site-packages\win32com\client\__init__.py", line 95, in Dispatch
    dispatch, userName = dynamic._GetGoodDispatchAndUserName(dispatch,userName,clsctx)
  File "C:\Python27\lib\site-packages\win32com\client\dynamic.py", line 114, in _GetGoodDispatchAndUserName
    return (_GetGoodDispatch(IDispatch, clsctx), userName)
  File "C:\Python27\lib\site-packages\win32com\client\dynamic.py", line 91, in _GetGoodDispatch
    IDispatch = pythoncom.CoCreateInstance(IDispatch, None, clsctx, pythoncom.IID_IDispatch)
pywintypes.com_error: (-2147221005, 'Invalid class string', None, None)

Is there another way to access the MS Word file and extract the data in it without going through all this?

Upvotes: 1

Views: 5585

Answers (2)

user3062149

Reputation: 4443

The code below worked for me. It simply changes "Word.Application" to "Word.Application.8":

import win32com.client as win32

word = win32.Dispatch("Word.Application.8")
word.Visible = 0
word.Documents.Open("myfile.docx")
doc = word.ActiveDocument
print doc.Content.Text
word.Quit()

I arrived at this solution by following @Torxed's suggestion to examine the registry. When I tried Word.Document.8, the available methods did not include .Visible, .Quit, or .Open, so @Torxed's solution did not work for me. (It is clear now that the Application and Document objects are intended for different uses.) However, I also found Word.Application, Word.Application.8, and Word.Application.14 in my registry, so I tried Word.Application.8 and it worked as expected.
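If it is unclear which of those ProgIDs is usable on a given machine, one option is to try them in order and keep the first that works. The following is only a sketch: dispatch_first_available is a hypothetical helper name, the candidate list just mirrors the registry entries mentioned above (yours may differ), and the dispatch parameter exists only so the logic can be exercised without Word installed.

```python
def dispatch_first_available(prog_ids, dispatch=None):
    """Return (prog_id, com_object) for the first ProgID that can be created.

    Hypothetical helper, not part of pywin32.
    """
    if dispatch is None:
        # Imported lazily so the function can be defined on machines without pywin32.
        import win32com.client
        dispatch = win32com.client.Dispatch

    failures = []
    for prog_id in prog_ids:
        try:
            return prog_id, dispatch(prog_id)
        except Exception as exc:  # pywin32 raises pywintypes.com_error for a bad ProgID
            failures.append((prog_id, exc))
    raise RuntimeError("No ProgID could be dispatched: %r" % (failures,))


# Example use on Windows (assumes these ProgIDs exist in your registry):
# prog_id, word = dispatch_first_available(
#     ["Word.Application", "Word.Application.14", "Word.Application.8"])
# word.Visible = 0
# word.Quit()
```

Whichever ProgID succeeds, the returned object is used exactly like the word object in the snippet above.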

Upvotes: 2

Torxed

Reputation: 23500

The win32 API for calling system APIs is great and all, but it is a chore. If you're open to the idea and you know you'll be working with Microsoft's newer XML-based document format, that is, .docx, I'd suggest using a dedicated Python module such as python-docx.

There's no reason to use the pyWin32 module unless you're going to do some very specific tasks.

There are also alternatives for Excel, such as openpyxl.
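To illustrate the point about .docx being XML-based: a .docx file is a zip archive whose main body lives in word/document.xml, so plain text can be pulled out with the standard library alone. This is a minimal sketch, not a replacement for python-docx; docx_paragraphs is a made-up helper name, and it ignores headers, footers, footnotes, and any special handling of tables or text boxes.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(path_or_file):
    """Return the text of each <w:p> paragraph in the main document part."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter("{%s}p" % W_NS):
        # A paragraph's text is split across one or more <w:t> run elements.
        pieces = [node.text or "" for node in para.iter("{%s}t" % W_NS)]
        paragraphs.append("".join(pieces))
    return paragraphs


# Example use (assumes myfile.docx exists in the current directory):
# for text in docx_paragraphs("myfile.docx"):
#     print(text)
```

With python-docx installed, the rough equivalent is to open the file with docx.Document("myfile.docx") and read the .text attribute of each entry in its .paragraphs list.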

As to your original problem, I'm guessing that the Word you're hooking against is not actually Microsoft Word 2013 but rather an unknown or missing application.

Quote link (this describes your issue and supports my guess that Word.Application is not actually an application):

You are trying to use a ProgID that does not exist. A "ProgID" is really just a mapping to its CLSID. It sounds like your object is not registering itself correctly.

Look in the registry - all COM objects have their name directly under HKEY_CLASSES_ROOT. Under that name, you will find a CLSID. This CLSID will then have a key under HKEY_CLASSES_ROOT\CLSID. Look at the registry to confirm that the names you tried do not exist as COM objects.

Otherwise, try using the CLSID of the object directly, instead of the ProgID - just pass the IID string directly to Dispatch()
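The registry check described in the quote can also be scripted instead of browsed by hand. A minimal sketch, assuming Windows and the standard-library winreg module (_winreg on Python 2); the three helper names are invented for illustration.

```python
def class_root_names():
    """List every key name directly under HKEY_CLASSES_ROOT (Windows only)."""
    try:
        import winreg             # Python 3
    except ImportError:
        import _winreg as winreg  # Python 2
    names, index = [], 0
    while True:
        try:
            names.append(winreg.EnumKey(winreg.HKEY_CLASSES_ROOT, index))
        except OSError:  # raised once the index runs past the last subkey
            return names
        index += 1


def matching_progids(names, prefix="Word.Application"):
    """Keep the exact ProgID and its versioned variants, e.g. 'Word.Application.14'."""
    return sorted(n for n in names if n == prefix or n.startswith(prefix + "."))


def clsid_for(prog_id):
    """Read the CLSID string stored in the CLSID subkey of the given ProgID."""
    try:
        import winreg
    except ImportError:
        import _winreg as winreg
    return winreg.QueryValue(winreg.HKEY_CLASSES_ROOT, prog_id + "\\CLSID")


# Example use on Windows:
# for prog_id in matching_progids(class_root_names()):
#     print(prog_id, clsid_for(prog_id))
```

An empty result from matching_progids would confirm that no Word.Application ProgID is registered, which is consistent with the 'Invalid class string' error.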

I checked my registry under HKEY_CLASSES_ROOT\CLSID\ and searched for "Word" starting from that key (folder). I got:

Key: {00020-0000-0000-0000-00000-0000} titled: Microsoft Word Document
with a sub-folder called ProgID, with the value: Word.Document.8
Which would let me do:

import win32com.client as win32

word = win32.Dispatch("Word.Document.8")
word.Visible = 0
word.Documents.Open("myfile.docx")
doc = word.ActiveDocument
print doc.Content.Text
word.Quit()

Now, this is an older version of Word, since I don't have Word 2013 or even something as fancy as 2010 :) Alternatively, I could just pass the key itself, which would be 00020-000.... (I think).

A neat lazy man's workaround: video tutorial here:

Upvotes: 1
