Reputation: 198
In How can I remove all images from a PDF?, Kurt Pfeifle gave a piece of PostScript code (by courtesy of Chris Liddell) to filter out all bitmaps from a PDF, using GhostScript.
This works like a charm; however, I'm also interested in the companion task of removing everything except bitmaps from the PDF, and without recompressing bitmaps. Or, eventually, separating the vector and bitmap "layers." (I know, this is not what a layer is in PDF terminology.)
AFAIU, Kurt's filter works by sending all bitmaps to a null device, while leaving everything else to pdfwrite
. I read that it is possible to use different devices with GS, so my hope is that it is possible to send everything to a fake/null device by default, and only switch to pdfwrite
for those images which are captured by the filter. But unfortunately I'm completely unable to translate such a thing into PostScript code.
Can anyone help, or at least tell me if this approach might be doomed to fail?
Upvotes: 3
Views: 307
Reputation: 31139
Its possible, but its a large amount of work.
You can't start with the nulldevice and push the pdfwrite device as needed, that simply won't work because the pdfwrite device will write out the accumulated PDF file as soon as you unload it. Reloadng it will start a new PDF file.
Also, you need the same instance of the pdfwrite device for all the code, so you can't load the pdfwrite device, load the nulldevice, then load the pdfwrite device again only for the bits you want. Which means the only approach which (currently) works is the one which Chris wrote. You need to load pdfwrite and push the null device into place whenever you want to silently consume an operation.
Just 'images' is quite a limited amount of change, because there aren't that many operators which deal with images.
In order to remove everything except images however, there are a lot of operators. You need to override; stroke, fill, eofill, rectstroke, rectfill, ustroke, ufill, ueofill, shfill, show, ashow, widthshow, awidthshow, xshow, xyshow, yshow, glyphshow, cshow and kshow. I might have missed a few operators but those are the basics at least.
Note that the code Chris originally posted did actually filter various types of objects, not just images, you can find his code here:
http://www.ghostscript.com/~chrisl/filter-obs.ps
Please be aware this is unsupported example code only.
Upvotes: 4