Reputation: 11
My PDF contains "%PDF-1.3" in its header, which should mean the PDF version is 1.3. But Adobe Reader (XI) installed on my system shows the version as 1.5 under File > Properties.
What is right?
1.3 or 1.5?
I can get the PDF version 1.3 by reading the PDF metadata in Java. How can I get the version 1.5 through a Java program?
Upvotes: 0
Views: 1348
Reputation: 1
Use Ghostscript to convert your file. This is the Linux command for that:
gs -o tempPdfFilePath -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 pdfFilePath && mv tempPdfFilePath pdfFilePath
Note that you cannot read and write on the same file, so you need a temp file name.
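Since the question is about Java, the same command could be driven from a Java program. The following is only a sketch: the class and method names are made up for illustration, it assumes a gs executable is on the PATH, and it reuses exactly the flags from the command above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: rewrites a PDF via Ghostscript using a temp file. */
final class GhostscriptRewrite {

    /** Builds the same argument list as the shell command above. */
    static List<String> command(Path input, Path output, String level) {
        return Arrays.asList("gs", "-o", output.toString(), "-sDEVICE=pdfwrite",
                "-dCompatibilityLevel=" + level, input.toString());
    }

    /** Runs gs into a temp file next to the input, then replaces the input. */
    static void rewriteInPlace(Path pdf, String level)
            throws IOException, InterruptedException {
        Path tmp = Files.createTempFile(pdf.toAbsolutePath().getParent(), "gs-", ".pdf");
        try {
            // Assumption: "gs" is installed and reachable via PATH.
            Process p = new ProcessBuilder(command(pdf, tmp, level)).inheritIO().start();
            if (p.waitFor() != 0) {
                throw new IOException("gs exited with a non-zero status");
            }
            Files.move(tmp, pdf, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; cleans up if gs failed.
            Files.deleteIfExists(tmp);
        }
    }
}
```

Note that this rewrites the file; it does not tell you which version the original file had.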
Upvotes: 0
Reputation: 95918
The version in the file header can be overridden later in the file, cf. the PDF specification:
Beginning with PDF 1.4, the Version entry in the document’s catalog dictionary (located via the Root entry in the file’s trailer, as described in 7.5.5, "File Trailer"), if present, shall be used instead of the version specified in the Header.
(section 7.5.2 File Header)
Thus,
What is right?
depends on the PDF contents. If you are not sure, please share your PDF for analysis.
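For reference, the header version itself can be read without any PDF library. A minimal sketch, assuming the header sits within the first 1024 bytes of the file; the class name PdfHeaderVersion is made up for illustration, and readNBytes requires Java 9 or later:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the version claimed by the "%PDF-x.y" header. */
final class PdfHeaderVersion {
    private static final Pattern HEADER = Pattern.compile("%PDF-(\\d\\.\\d)");

    /** Returns e.g. "1.3", or null if no header is found in the given bytes. */
    static String parse(byte[] head) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data is harmless.
        String text = new String(head, StandardCharsets.ISO_8859_1);
        Matcher m = HEADER.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    /** Reads the first 1024 bytes of the file (an assumed window) and parses them. */
    static String read(String path) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            byte[] buf = new byte[1024];
            int n = in.readNBytes(buf, 0, buf.length);
            return parse(Arrays.copyOf(buf, n));
        }
    }
}
```

This only yields what the header claims; it says nothing about a Version entry in the catalog.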
Concerning questions from comments...
(1) I don’t find anything like 1.5 when opening the PDF with Notepad, yet it still shows the version as 1.5. Would the version be in encoded form?
No, but it would be a name, not a number:
The value of this entry shall be a name object, not a number, and therefore shall be preceded by a SOLIDUS (2Fh) character (/) when written in the PDF file (for example, /1.4).
(Table 28 – Entries in the catalog dictionary)
So a search for "1.5" should find it. Unless, that is, compressed object streams (a PDF 1.5 feature) are used and the newest catalog has been put into such an object stream.
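Such a search can also be done programmatically. The sketch below is a naive byte scan, not a PDF parser: it looks for a /Version name entry anywhere in the raw file and returns the last hit, on the assumption that later incremental updates are appended at the end. It does not check that the entry belongs to the catalog, and it cannot see into compressed object streams. The class name is made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: naive scan for a "/Version /x.y" entry in raw PDF bytes. */
final class PdfCatalogVersionScan {
    // The value is a name object, so it is preceded by a slash, e.g. "/Version /1.5".
    private static final Pattern VERSION_ENTRY =
            Pattern.compile("/Version\\s*/(\\d\\.\\d)");

    /** Returns the last uncompressed Version entry found, or null if there is none. */
    static String lastVersionEntry(byte[] pdf) {
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        Matcher m = VERSION_ENTRY.matcher(text);
        String last = null;
        while (m.find()) {
            last = m.group(1); // keep overwriting: the last match wins
        }
        return last;
    }
}
```

A null result therefore means "not found in plain text", not "no Version entry exists".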
(2) Is there any pdf-api available in java to read such version entries.
You can read the entry using any library allowing access to its low level routines, e.g. iText, PDFBox, PDFClown, ...
(3) If Yes, how to ?
In iText, for a PdfReader reader:
reader.getCatalog().getAsName(PdfName.VERSION)
In PDFClown, for a Document document:
document.getVersion()
while the original header version is retrieved from a File file using:
file.getVersion()
(PDFClown information proposed by Stefano Chizzolini)
(4) Would you please let me know what type of content I should check to detect pdf’s actual version?
Usually checking the header and the catalog should suffice.
Probably, though, some programs, when spotting the use of a PDF feature only present in later PDF specifications, return the smallest PDF specification version in which all used features are present. In that case you'd have to check all the reachable PDF content.
This would especially make sense for cross reference and object streams introduced in 1.5.
Also, if I edit the PDF header to version 1.6, it shows the version as 1.6. So it means Adobe doesn’t simply display the property overridden by the Version entry in the document’s catalog dictionary; it takes the later version of the two.
That's correct, and it is also mentioned in the specification of the Version catalog entry:
The version of the PDF specification to which the document conforms (for example, 1.4) if later than the version specified in the file’s header (see 7.5.2, "File Header"). If the header specifies a later version, or if this entry is absent, the document shall conform to the version specified in the header.
(Table 28 – Entries in the catalog dictionary)
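The rule quoted from Table 28 boils down to "take the later of the two". A minimal sketch of that comparison, assuming both versions are given as "major.minor" strings and that a missing catalog entry is passed as null; the class and method names are made up for illustration:

```java
/** Hypothetical helper: picks the effective version from header and catalog. */
final class PdfEffectiveVersion {

    /** Returns the catalog version if it is later than the header, else the header. */
    static String effective(String header, String catalog) {
        if (catalog == null) {
            return header; // entry absent: the header version applies
        }
        return compare(catalog, header) > 0 ? catalog : header;
    }

    /** Compares "major.minor" strings numerically, e.g. "1.5" > "1.3". */
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int major = Integer.compare(Integer.parseInt(x[0]), Integer.parseInt(y[0]));
        if (major != 0) {
            return major;
        }
        return Integer.compare(Integer.parseInt(x[1]), Integer.parseInt(y[1]));
    }
}
```

For example, effective("1.3", "1.5") gives "1.5", while effective("1.6", "1.5") gives "1.6", which matches the behaviour observed after editing the header.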
Concerning the provided screenshot
The OP provided a screenshot:
One can clearly see that the file in question is linearized (on the left side one can see the linearization parameter dictionary and on the right side this is confirmed by "Fast Web View: Yes"). Following the linearization parameter dictionary there are the cross references for the first page, and these cross references are provided as a cross reference stream, not a cross reference table.
Cross reference streams have been introduced in PDF 1.5, and PDFs using cross reference streams instead of cross reference tables cannot even be parsed according to the PDF 1.4 and 1.3 references.
I assume that Adobe Reader claims a version 1.5 because of this unparsability according to specifications before 1.5.
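Whether a file uses a cross reference stream can be checked roughly without a PDF library. The following is a heuristic sketch only: it follows the last startxref offset and looks at what starts there, either the keyword xref (a table) or an indirect object whose dictionary contains /Type /XRef (a stream). It ignores hybrid files and earlier cross reference sections, and assumes the dictionary fits within 2048 bytes. All names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: guesses the kind of cross reference section in use. */
final class XrefKind {
    enum Kind { TABLE, STREAM, UNKNOWN }

    private static final Pattern OFFSET = Pattern.compile("\\s*(\\d{1,18})");
    private static final Pattern XREF_OBJ =
            Pattern.compile("\\d+\\s+\\d+\\s+obj\\b.*?/Type\\s*/XRef\\b", Pattern.DOTALL);

    static Kind detect(byte[] pdf) {
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        int kw = text.lastIndexOf("startxref");
        if (kw < 0) {
            return Kind.UNKNOWN;
        }
        // Parse the byte offset that follows the last "startxref" keyword.
        Matcher off = OFFSET.matcher(text);
        off.region(kw + "startxref".length(), text.length());
        if (!off.lookingAt()) {
            return Kind.UNKNOWN;
        }
        long offset = Long.parseLong(off.group(1));
        if (offset >= text.length()) {
            return Kind.UNKNOWN;
        }
        int start = (int) offset;
        String at = text.substring(start, Math.min(text.length(), start + 2048));
        if (at.startsWith("xref")) {
            return Kind.TABLE;
        }
        // Only inspect the dictionary, i.e. the part before the "stream" keyword.
        int streamKw = at.indexOf("stream");
        String dict = streamKw >= 0 ? at.substring(0, streamKw) : at;
        return XREF_OBJ.matcher(dict).lookingAt() ? Kind.STREAM : Kind.UNKNOWN;
    }
}
```

A result of STREAM would suggest that the file needs at least a PDF 1.5 parser, whatever the header claims.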
I think, I would not be able to fetch 1.5 as version from PDF with other API. Is it so?
I assume so, at least immediately; many libraries may hide such details (like whether cross reference streams or tables are used) from the user. As you have not provided the PDF in question, though, this is a mere assumption.
What solution I should provide to my customer? I have been working in Publishing domain segment. Working in an application developed in java, we do have the validation check : System must not allow PDF version 1.3 and before.
That requirement already is not well defined. What is a PDF version 1.3 and before?
Is it a PDF file which does claim to be 1.3 or before?
As a special case, what about PDFs claiming different versions? E.g. different entries in header and catalog, or different entries in different incremental updates. Is such a PDF 1.3 or before if one of the differing entries is 1.3 or before? Or only if all are 1.3 or before? Or does the newest catalog version entry need to be 1.3 or before?
Is it a PDF file which a chosen indicator program (e.g. Adobe Reader in a fixed version) recognizes as 1.3 or before?
Is it a PDF which is valid according to a PDF reference 1.3 or before?
Or is it a PDF which is not valid according to any PDF reference 1.4 and after?
The only thing easy to implement is the first variant (having decided on the special cases), but what customers from the publishing context most likely mean is something along the lines of the last variant.
We check the PDF version using the PDF Tool Box Java jar, which gives the PDF version as 1.3, so the validation fails. The client is objecting that it is a valid PDF, showing a screenshot of File > Properties from the opened PDF. Now, what should be the next step?
The next step? Get together with the customer and reach a common understanding of what a PDF version 1.3 and before means. And then reconsider whether you still want to implement that. It might be a matter of some person-years.
Upvotes: 1