John

Reputation: 1653

PDF - why is there no standard structure element for a page?

The PDF Spec defines standard structure types, used to define a structure tree for the document. As far as I can see, there is no element related to pages. Here are the standard structure types for grouping elements:

Document
Part
Art
Sect
Div
...and so on...

Why is there no Page item in this list?

If you want your structure to use pages, what should be used? Part? Sect? Div?

Upvotes: 0

Views: 290

Answers (3)

userx

Reputation: 3815

PDF tags exist so that the content type / meaning of elements can be identified. They should be considered a kind of "meta" information for the PDF, simply providing context for the content in a file (so that content can be easily extracted, converted, processed, made accessible, etc.). Think of it as a table of contents to a book. Just because the book has x pages doesn't mean that the content structure would be altered if the book's page height was cut in half and it now had 2x pages.

A Page Object in the PDF Document Structure already groups elements (by nature of each element being on a given page), so grouping by page again in the structure tree would be a little redundant.

Also, consider this case:

  • Document
    • Table of Contents (Page 1)
    • Section 1 (starts on page 2, ends mid page 3)
      1. Sub Section (page 2)
      2. Sub Section (half of page 3)
    • Section 2 (starts mid page 3)

etc...

In this example, Section 1 and Section 2 couldn't both be direct parents of page 3 (not to mention that Section 1 spans two different pages). Additionally, trying to solve this problem really isn't necessary, because the elements being grouped here are each already a child of their respective Page node in the Document Structure of the actual file format.
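To make the example above concrete, here is a minimal sketch in Python. It is not real PDF parsing: `StructElem`, `ContentItem`, and `pages_spanned` are hypothetical names invented for illustration. The only idea borrowed from the format is that a piece of marked content records which page it sits on, so the logical tree never needs a Page element of its own.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """Hypothetical stand-in for a piece of marked content on one page."""
    page: int   # page number the content is drawn on
    mcid: int   # marked-content identifier, unique within that page


@dataclass
class StructElem:
    """Hypothetical stand-in for a structure element (Document, Sect, ...)."""
    tag: str
    title: str = ""
    kids: list = field(default_factory=list)


def pages_spanned(elem):
    """Collect every page touched by the content below a structure element."""
    pages = set()
    for kid in elem.kids:
        if isinstance(kid, ContentItem):
            pages.add(kid.page)
        else:
            pages |= pages_spanned(kid)
    return pages


# The outline from the example: Section 1 ends and Section 2 starts on page 3.
doc = StructElem("Document", kids=[
    StructElem("TOC", "Table of Contents", [ContentItem(1, 0)]),
    StructElem("Sect", "Section 1", [
        StructElem("Sect", "Sub Section 1", [ContentItem(2, 0)]),
        StructElem("Sect", "Sub Section 2", [ContentItem(3, 0)]),
    ]),
    StructElem("Sect", "Section 2", [ContentItem(3, 1)]),
])

spans = {elem.title: sorted(pages_spanned(elem)) for elem in doc.kids}
```

Here `spans` maps Section 1 to pages 2 and 3 and Section 2 to page 3, so page 3 belongs to two sections at once. A tree can't express that if pages are nodes in it, but it falls out naturally when the page is just an attribute of each content item.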

Upvotes: 1

mark stephens

Reputation: 3184

A PDF file organizes its pages in a tree structure (which is what allows it to load any page so fast). The content itself does not have any structure unless you choose to use the marked content feature of the format, which then allows metadata to be included in the data.
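As a rough illustration of why a page tree makes page lookup fast, here is a sketch in Python that models the nodes as plain dictionaries. The keys `Type`, `Kids`, and `Count` mirror the entries a page tree node carries in the file format; the `Label` key and the `find_page` helper are assumptions made up for this example, and no real PDF library is involved.

```python
def find_page(node, index):
    """Return the leaf page at zero-based `index` below `node`.

    Intermediate nodes store the number of leaf pages beneath them in
    "Count", so whole subtrees can be skipped without visiting them.
    """
    if node["Type"] == "Page":
        if index != 0:
            raise IndexError("page index out of range")
        return node
    for kid in node["Kids"]:
        count = kid["Count"] if kid["Type"] == "Pages" else 1
        if index < count:
            return find_page(kid, index)
        index -= count  # skip this subtree entirely
    raise IndexError("page index out of range")


def page(label):
    # "Label" is a hypothetical key, used only to tell the leaves apart.
    return {"Type": "Page", "Label": label}


root = {
    "Type": "Pages",
    "Count": 4,
    "Kids": [
        {"Type": "Pages", "Count": 2, "Kids": [page("p1"), page("p2")]},
        {"Type": "Pages", "Count": 2, "Kids": [page("p3"), page("p4")]},
    ],
}

third = find_page(root, 2)  # third page of the document
```

The lookup skips the first subtree after reading only its `Count`, which is the property that lets a viewer jump to an arbitrary page without reading all the pages before it.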

Upvotes: 0

CommonSense

Reputation: 333

Appendix G of the PDF Specification gives examples that demonstrate use of the Page object.

Upvotes: 0
