Reputation: 61
I'm looking for a Python script that can extract the text from the first page of a Word document. I have found functions that work on paragraphs but not on pages, which is not what I need.
Upvotes: 4
Views: 1605
Reputation: 224
The problem is that pages in the docx format are purely virtual. MS Word decides by itself where to put page breaks, based on the text size and other parameters.
It's a little easier when the user has explicitly set page breaks, since those are stored in the file and can be found as described there, for example.
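For the explicit-break case, here is a minimal sketch using only the standard library. It assumes the file is a regular .docx (a zip archive containing `word/document.xml`) and stops at the first `w:br` element with `w:type="page"`. It also stops at `w:lastRenderedPageBreak`, a hint Word may save at the point where it last laid out a page break; that hint is not guaranteed to be present or up to date. The function names `first_page_text_from_xml` and `first_page_text` are made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def first_page_text_from_xml(document_xml):
    """Return paragraph texts up to the first page-break marker.

    If the document contains no marker at all, every paragraph is returned.
    """
    root = ET.fromstring(document_xml)
    body = root.find(W + "body")
    paragraphs = []
    if body is None:
        return paragraphs
    # Note: this also visits paragraphs inside tables, in document order.
    for para in body.iter(W + "p"):
        chunks = []
        for node in para.iter():
            if node.tag == W + "t":
                chunks.append(node.text or "")
            elif node.tag == W + "lastRenderedPageBreak" or (
                node.tag == W + "br" and node.get(W + "type") == "page"
            ):
                # Keep the text seen so far in this paragraph, then stop.
                paragraphs.append("".join(chunks))
                return paragraphs
        paragraphs.append("".join(chunks))
    return paragraphs


def first_page_text(path):
    """Read word/document.xml from a .docx file and apply the function above."""
    with zipfile.ZipFile(path) as archive:
        return first_page_text_from_xml(archive.read("word/document.xml"))
```

Calling `first_page_text("report.docx")` (the file name is just a placeholder) returns a list of strings, one per paragraph. If the document has no explicit break and no saved layout hint, you get the whole document back, so this only helps in the case described above.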
As a workaround, you can estimate the number of lines per page and trim the text yourself, but as far as I know, there's no "easy" way to do it all in one line of code.
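A rough sketch of that lines-per-page workaround, assuming you already have the paragraphs as a list of strings. The defaults of 45 lines per page and 90 characters per line are guesses that you would need to tune for the actual font, margins and page size; the function name `trim_to_first_page` is made up for this example. The result is only an approximation of what Word would render.

```python
import textwrap


def trim_to_first_page(paragraphs, lines_per_page=45, chars_per_line=90):
    """Keep paragraphs until the estimated line budget of one page is used up."""
    page = []
    used = 0
    for text in paragraphs:
        remaining = lines_per_page - used
        if remaining <= 0:
            break
        # An empty paragraph still takes up one line on the page.
        wrapped = textwrap.wrap(text, width=chars_per_line) or [""]
        if len(wrapped) > remaining:
            # The paragraph spills over: keep only the lines that still fit.
            page.append(" ".join(wrapped[:remaining]))
            break
        page.append(text)
        used += len(wrapped)
    return page
```

For example, `trim_to_first_page(["alpha", "beta", "gamma"], lines_per_page=2)` keeps only the first two paragraphs, because each of them is estimated to take one line.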
Upvotes: 2