marcjohne

Reputation: 191

Converting PDF to PDF/A-1a with iTextSharp

I want to load a plain PDF file in iText and export (or write) it as PDF/A-1a.

I've got "iText in Action, 2nd edition" at hand and I'm using iTextSharp. Still, progress == null.

Upvotes: 0

Views: 6869

Answers (3)

Dragos Durlut

Reputation: 8098

Try this.

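        // Load the source PDF; GetTemplateBytes() is this answer's own helper returning the file bytes.
        PdfReader reader = new PdfReader(GetTemplateBytes());
        PdfStamper pst = new PdfStamper(reader, Response.OutputStream);

        // PDF/A-1 is based on PDF 1.4. These two calls only declare the target conformance
        // level; they do not add missing tagging, fonts, or an output intent.
        pst.Writer.SetPdfVersion(PdfWriter.PDF_VERSION_1_4);
        pst.Writer.PDFXConformance = PdfWriter.PDFA1A;

        // Close the stamper so the output is actually written.
        pst.Close();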

Upvotes: 0

Mark Storer

Reputation: 15868

Hiya Leonard. (Leonard works for Adobe as their PDF Dev Evangelist Guy. His PDF-Fu is Mighty. I'll refrain from comparing our mightiness in some vague attempt at false modesty.) >:)

Arbitrary PDF -> PDF/A-1a is all but impossible. 1a requires Tagged PDF: a great big pile of structural information embedded in your tagging... about as much info as you'd need to rebuild the PDF as HTML/CSS.

Going from "this is a pile of lines and characters with these coordinates" to "this is a table with X columns and Y rows and the following information in its cells" is EXTREMELY DIFFICULT. All but impossible.

PDF/A-1b is much more realistic, though still not easy. You need to put everything into a specific set of colorspaces and rendering intents and things with molecular structures that your primitive intellect wouldn't understand.

(Terrible misquote, but there's still some funny in there, so I left it.)

iText[Sharp] supports generating PDF/A inasmuch as it will tell you when you do something blatantly against the spec... but it may not catch it until you call document.close(). The programmer writing the generator still needs to fill in a Whole Bunch of Information "manually".
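
To give a flavor of that "manual" work, here is a minimal sketch of generating (not converting) a small PDF/A-1b file. It assumes the iTextSharp 5.0/5.1-era API that the snippet in the first answer uses (`PDFXConformance`, `PdfWriter.PDFA1B`); other releases may handle PDF/A differently. The class name, the method name, and the font and ICC profile paths are hypothetical placeholders, and the result should still be run through a PDF/A validator.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class PdfA1bSketch   // hypothetical name, for illustration only
{
    // fontPath and iccPath are placeholders: a TrueType font file and an
    // sRGB ICC profile that you supply yourself.
    public static void Write(string outputPath, string fontPath, string iccPath)
    {
        Document document = new Document();
        using (FileStream output = new FileStream(outputPath, FileMode.Create))
        {
            PdfWriter writer = PdfWriter.GetInstance(document, output);

            // Declare the target before opening the document so iText can
            // complain about violations while it writes.
            writer.SetPdfVersion(PdfWriter.PDF_VERSION_1_4);
            writer.PDFXConformance = PdfWriter.PDFA1B;
            document.Open();

            // PDF/A wants an output intent backed by an embedded ICC profile.
            ICC_Profile icc = ICC_Profile.GetInstance(File.ReadAllBytes(iccPath));
            writer.SetOutputIntents("Custom", "", "http://www.color.org",
                                    "sRGB IEC61966-2.1", icc);

            // Every font has to be embedded, so build one explicitly.
            BaseFont baseFont = BaseFont.CreateFont(fontPath, BaseFont.WINANSI,
                                                    BaseFont.EMBEDDED);
            document.Add(new Paragraph("Hello PDF/A-1b", new Font(baseFont, 12)));

            // XMP metadata is required; create it before closing.
            writer.CreateXmpMetadata();
            document.Close();
        }
    }
}
```

Even this toy case needs a font, a color profile, and metadata supplied by hand, and none of it touches the tagging that 1a would demand on top.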

Ain't nobody that can say "we'll take some arbitrary PDF and turn it into PDF/A-1a" (without lying through their teeth). You point me at some software that says so, and I'll give you a perfectly valid PDF that'll break it. EVERY TIME. I'd bet money on it.

You need a copy of the PDF/A ISO spec ($). You need a copy of the PDF ISO spec (free!). You need to KNOW THEM. And then you'll understand what you're up against.

Now all that is "Arbitrary PDF". If you have a stack of instances of some report that are all coming from the same program, then there's a light at the end of the tunnel. It's still a long tunnel, but the problem degrades to "hard" instead of "almost impossible". And once you've got one report working, handling similar reports FROM THE SAME APP is likely to be relatively easy.

Still not fun.

Upvotes: 7

Leonard Rosenthol

Reputation: 89

iText doesn't support conversion of PDF->PDF/A "out of the box".

You could certainly use the low-level APIs in the library as a starting point for writing such a converter... but it would only be a start.

Upvotes: 1
