rodrigoalvesvieira

Reputation: 8062

Extract information from a MS Word document via Ruby

Is there any way to extract data from an MS Word document using Ruby? I only need to know the number of pages in a given document.

I couldn't find a library for this. Do you know of any way to do this?

Thanks in advance.

Upvotes: 0

Views: 1095

Answers (2)

Gergo Erdosi

Reputation: 42043

You can use the yomu gem:

require 'yomu'

data = File.binread 'file.docx'  # read in binary mode; a .docx is a zip archive
metadata = Yomu.read :metadata, data

puts metadata['Page-Count']
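
The lookup above can be made a bit more defensive. This is only a sketch: the key name may differ depending on the Tika version behind yomu, and the value may come back as a string. The helper name `page_count_from` and the fallback key `xmpTPg:NPages` are my own assumptions, not part of yomu's API.

```ruby
# Hypothetical helper (not part of yomu): pick a page count out of a
# metadata hash such as the one returned by Yomu.read(:metadata, data).
# Assumption: the count is stored under 'Page-Count' or, with some Tika
# versions, 'xmpTPg:NPages', and may be a String or an Integer.
PAGE_COUNT_KEYS = ['Page-Count', 'xmpTPg:NPages'].freeze

def page_count_from(metadata)
  key = PAGE_COUNT_KEYS.find { |k| metadata.key?(k) }
  return nil unless key

  # Parse as base 10; returns nil instead of raising on non-numeric values.
  Integer(metadata[key].to_s, 10, exception: false)
end

# Usage with a hand-built hash standing in for real yomu output:
puts page_count_from({ 'Page-Count' => '12' }).inspect
```

Returning nil when the key is missing lets the caller decide how to handle documents that carry no page count.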

Upvotes: 2

S. A.

Reputation: 3754

If you're on Windows and have Word installed, you can use win32ole. You can open the file with:

require 'win32ole'

word = WIN32OLE.new('Word.Application')
word.Visible = true
document = word.Documents.Open('c:\WordDocs\MyWordFile.doc')

And, according to this answer, you could get the number of pages with:

# WdStatisticPages is not defined as a Ruby constant by default; its value is 2
page_count = document.Range.ComputeStatistics(2)
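
Putting those steps together, here is a minimal sketch that also closes the document and quits Word afterwards. The method name `word_page_count` and the constant `WD_STATISTIC_PAGES` are names I made up for illustration; the numeric values are those of Word's `wdStatisticPages` (2) and `wdDoNotSaveChanges` (0) enumerations. It only runs on Windows with Word installed.

```ruby
# Value of Word's wdStatisticPages enumeration member.
WD_STATISTIC_PAGES = 2

# Hypothetical helper: open a document in Word via OLE and return its
# page count. `path` should be an absolute path to the file.
def word_page_count(path)
  require 'win32ole' # loaded lazily; only available on Windows

  word = WIN32OLE.new('Word.Application')
  word.Visible = false
  document = word.Documents.Open(path)
  document.Range.ComputeStatistics(WD_STATISTIC_PAGES)
ensure
  # Always release Word, even if opening or counting failed.
  document.Close(0) if document # 0 = wdDoNotSaveChanges
  word.Quit if word
end

# Usage (Windows only):
#   puts word_page_count('c:\WordDocs\MyWordFile.doc')
```

Without the `ensure` block, a failed call can leave an invisible Word process running in the background.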

Upvotes: 1
