MmBaguette

Reputation: 420

Extracting text from a Google document and getting a particular page

As of now, I export my Google documents by getting the content from this link:

https://docs.google.com/feeds/download/documents/export/Export?id=DOCUMENT_ID&exportFormat=EXPORT_FORMAT
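For reference, here is a minimal sketch of how I build that link in Python. It assumes the endpoint takes `id` and `exportFormat` as query parameters; `build_export_url` and `EXPORT_BASE` are names I made up for illustration, and the request itself still needs whatever authentication your document requires.

```python
from urllib.parse import urlencode

# Base of the export endpoint quoted above.
EXPORT_BASE = "https://docs.google.com/feeds/download/documents/export/Export"


def build_export_url(document_id: str, export_format: str) -> str:
    """Return the export link for one document and one format (hypothetical helper)."""
    # urlencode escapes the values and joins them as id=...&exportFormat=...
    query = urlencode({"id": document_id, "exportFormat": export_format})
    return f"{EXPORT_BASE}?{query}"


# The format string "html" is an assumption; use whichever value works for you.
print(build_export_url("DOCUMENT_ID", "html"))
```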

This works fine: I export my doc to HTML and then read from it. However, the HTML gives me no way to know where a page starts or ends.

Here are all the export formats I know of:

HTML, PDF, ODT, TXT, RTF and DOCX

PDF, ODT, RTF and DOCX all show separate pages when opened in a renderer. However, after trying numerous libraries for these formats (python-docx, PyPDF4, PyRTF, etc.), I have not found a working way to read a Google document page by page.

Any suggestions?

Upvotes: 0

Views: 750

Answers (1)

Aerials

Reputation: 4419

You could use Apps Script: its DocumentApp service lets you locate the PageBreak elements in a document.

You could then serve your tailored content as a web app.
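Note that, as far as I know, PageBreak elements are only the explicit breaks inserted by the author, not the automatic ones produced by layout. If that is enough for you, the same idea can be sketched on the Python side against the HTML export you already have. This is a rough sketch, not a definitive implementation: it assumes the export marks each explicit page break with an `<hr>` whose style contains `page-break-before:always` (check your own export to confirm), and `PageSplitter` and `split_pages` are names made up for this example.

```python
from html.parser import HTMLParser


class PageSplitter(HTMLParser):
    """Collect visible text, starting a new page at each page-break <hr>."""

    def __init__(self):
        super().__init__()
        self.pages = [[]]     # one list of text fragments per page
        self._skip_depth = 0  # greater than 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip_depth += 1
        elif tag == "hr":
            style = (dict(attrs).get("style") or "").replace(" ", "").lower()
            # Assumed marker for an explicit page break in the export.
            if "page-break-before:always" in style:
                self.pages.append([])

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside <style>/<script>.
        if not self._skip_depth and data.strip():
            self.pages[-1].append(data.strip())


def split_pages(html: str) -> list:
    """Return the text of each page, split at explicit page breaks."""
    parser = PageSplitter()
    parser.feed(html)
    parser.close()
    return [" ".join(fragments) for fragments in parser.pages]


sample = (
    "<html><body><p>First page</p>"
    '<hr style="page-break-before:always;display:none;">'
    "<p>Second page</p></body></html>"
)
for number, text in enumerate(split_pages(sample), start=1):
    print(number, text)
```

A document with no explicit breaks comes back as a single page, so this does not recover the automatic pagination you see in a renderer.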

Upvotes: 1
