William Jockusch

Reputation: 27295

View the innards of a .ppt file?

I need to figure out what is going on inside a client's .ppt files. What is a good way to get started?

My eventual hope is to convert it to HTML. But if I just export the .ppt to HTML, I get a lot of images (as opposed to text), which is not a Good Thing.

EDIT: software that automatically converts .ppt to HTML would be terrific, provided that it preserves as much information as possible in text format. If that doesn't exist, the next best thing would be to understand the innards of the .ppt and write my own code to do a partial conversion.

EDIT: I used OfficeConvert as recommended by Michiel Leenaars. It got me text all right. My 50-page, 8MB test file turned into 40MB of text. Getting text is good, but a fivefold increase in size is a move in the wrong direction. There is also an awful lot of repetition in there: the word "style" appeared 410815 times and the word "draw" appeared 351229 times.

Upvotes: 1

Views: 225

Answers (3)

user489041

Reputation: 28304

If you know Java, Apache has the POI project, which lets you look at the internals of a PPT file. You can pull out all the information you want (images, text) and then convert it to HTML however you like.

It's free too.

Upvotes: 0

Kate Gregory

Reputation: 18954

I like the Aspose products. (I'm not associated with them other than as a customer.) I've used the PPT one specifically to write code that pokes around in the insides of a PPT. Overkill if you just want to convert it to HTML, but invaluable for the sorts of things I use it for.

Upvotes: 0

Michiel Leenaars

Reputation: 46

I think a safe way would be to use OfficeConvert, which drives Microsoft Office programmatically, to convert the files to ODF automatically. Run it with /? to get help. There are some dependencies (see below).

Then use a good ODF library like lpod to look inside it.
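If you just want a first look before picking a library, here is a rough sketch that uses only the Python standard library rather than lpod. It relies on the fact that an ODF presentation (.odp) is a zip archive whose slide content lives in content.xml, with each slide in a draw:page element and its text in text:p elements. The function name slide_texts and the file name slides.odp used below are made up for illustration, and this ignores styles, images, and layout entirely.

```python
import zipfile
import xml.etree.ElementTree as ET

# Standard OpenDocument namespace URIs.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"


def slide_texts(odp_path):
    """Return a list of (slide_name, [paragraph, ...]) for an .odp file.

    slide_texts is a hypothetical helper, not part of lpod or any library.
    """
    # An ODF file is a zip archive; the document body is in content.xml.
    with zipfile.ZipFile(odp_path) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    slides = []
    for page in root.iter(f"{{{DRAW_NS}}}page"):
        name = page.get(f"{{{DRAW_NS}}}name", "")
        paragraphs = []
        for para in page.iter(f"{{{TEXT_NS}}}p"):
            # itertext() also collects text nested in text:span children.
            text = "".join(para.itertext()).strip()
            if text:
                paragraphs.append(text)
        slides.append((name, paragraphs))
    return slides
```

Calling slide_texts("slides.odp") gives you one entry per slide, which you can then wrap in whatever HTML you like. Since it keeps only the paragraph text, it also sidesteps the bulk of the repeated "style" and "draw" markup.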

You can view some interesting code examples here.


Dependencies:

Upvotes: 3
