Reputation: 103
How does one extract attached files from a PDF with iText 7?
The sample code I found for iText 5 no longer works.
What I need is a byte[] per file, as in the iText 5 example below:
    PdfReader reader = new PdfReader(SRC);
    Map<String, byte[]> files = new HashMap<String,byte[]>();
    PdfObject obj;
    for (int i = 1; i <= reader.getXrefSize(); i++) {
        obj = reader.getPdfObject(i);
        if (obj != null && obj.isStream()) {
            PRStream stream = (PRStream)obj;
            byte[] b;
            try {
                b = PdfReader.getStreamBytes(stream);
            }
            catch(UnsupportedPdfException e) {
                b = PdfReader.getStreamBytesRaw(stream);
            }
            files.put(Integer.toString(i), b);
        }
    }
Thx /markus
Upvotes: 0
Views: 4774
Reputation: 77606
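You are searching for attachments by brute force: your loop collects every stream in the file (page content, fonts, images, and so on), not just attachments. The targeted approach is to query the catalog for embedded files and to query the page dictionaries for file attachment annotations.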
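For comparison, here is a minimal sketch of that targeted lookup, using only low-level dictionary access in iText 7. It is an illustration under assumptions, not code from the original answer: the class name `EmbeddedFileExtractor`, its helper methods, and the `pageN-annotM` key scheme for annotation attachments are all hypothetical, and the sketch assumes a well-formed name tree (no cycles in `/Kids`).

```java
import com.itextpdf.kernel.pdf.PdfArray;
import com.itextpdf.kernel.pdf.PdfDictionary;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfName;
import com.itextpdf.kernel.pdf.PdfStream;
import com.itextpdf.kernel.pdf.PdfString;

import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; the name and structure are illustrative only.
public class EmbeddedFileExtractor {

    /** Collects document-level attachments and file attachment annotations. */
    public static Map<String, byte[]> extract(PdfDocument pdfDoc) {
        Map<String, byte[]> files = new LinkedHashMap<>();

        // Document-level attachments: catalog -> /Names -> /EmbeddedFiles name tree.
        PdfDictionary names = pdfDoc.getCatalog().getPdfObject().getAsDictionary(PdfName.Names);
        if (names != null) {
            walkNameTree(names.getAsDictionary(PdfName.EmbeddedFiles), files);
        }

        // Page-level attachments: annotations with /Subtype /FileAttachment.
        for (int p = 1; p <= pdfDoc.getNumberOfPages(); p++) {
            PdfArray annots = pdfDoc.getPage(p).getPdfObject().getAsArray(PdfName.Annots);
            if (annots == null) {
                continue;
            }
            for (int a = 0; a < annots.size(); a++) {
                PdfDictionary annot = annots.getAsDictionary(a);
                if (annot != null && PdfName.FileAttachment.equals(annot.getAsName(PdfName.Subtype))) {
                    // Assumed key scheme: annotations have no name-tree key of their own.
                    addFile("page" + p + "-annot" + a, annot.getAsDictionary(PdfName.FS), files);
                }
            }
        }
        return files;
    }

    // A name tree node holds /Names (key, value, key, value, ...) and/or /Kids (child nodes).
    private static void walkNameTree(PdfDictionary node, Map<String, byte[]> files) {
        if (node == null) {
            return;
        }
        PdfArray pairs = node.getAsArray(PdfName.Names);
        if (pairs != null) {
            for (int i = 0; i + 1 < pairs.size(); i += 2) {
                PdfString key = pairs.getAsString(i);
                String name = (key == null) ? "unnamed-" + i : key.toUnicodeString();
                addFile(name, pairs.getAsDictionary(i + 1), files);
            }
        }
        PdfArray kids = node.getAsArray(PdfName.Kids);
        if (kids != null) {
            for (int i = 0; i < kids.size(); i++) {
                walkNameTree(kids.getAsDictionary(i), files);
            }
        }
    }

    // Reads the embedded file stream referenced by a file specification dictionary.
    private static void addFile(String key, PdfDictionary fileSpec, Map<String, byte[]> files) {
        if (fileSpec == null) {
            return;
        }
        PdfDictionary ef = fileSpec.getAsDictionary(PdfName.EF);
        if (ef == null) {
            return;
        }
        PdfStream stream = ef.getAsStream(PdfName.F);
        if (stream == null) {
            stream = ef.getAsStream(PdfName.UF);
        }
        if (stream != null) {
            files.put(key, stream.getBytes()); // decoded bytes of the attached file
        }
    }
}
```

Calling `EmbeddedFileExtractor.extract(new PdfDocument(new PdfReader(SRC)))` should then give one `byte[]` per attachment, keyed by the name-tree entry for document-level attachments, which is the `Map<String, byte[]>` shape the question asks for.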
Anyway, if I were to port your code to iText 7, it would look like this:
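    PdfDocument pdfDoc = new PdfDocument(new PdfReader(SRC));
    PdfObject obj;
    // Visit every indirect object in the document and keep only the streams.
    for (int i = 1; i <= pdfDoc.getNumberOfPdfObjects(); i++) {
        obj = pdfDoc.getPdfObject(i);
        if (obj != null && obj.isStream()) {
            byte[] b;
            try {
                // Decoded stream content (filters applied).
                b = ((PdfStream) obj).getBytes();
            } catch (PdfException exc) {
                // Fall back to the raw, still-encoded bytes if decoding fails.
                b = ((PdfStream) obj).getBytes(false);
            }
            // DEST is a format string with a placeholder for the object number.
            try (FileOutputStream fos = new FileOutputStream(String.format(DEST, i))) {
                fos.write(b);
            }
        }
    }
    pdfDoc.close();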
The only change I made is that I write each stream to a file.
Upvotes: 2