Zohar Peled

Reputation: 82474

How can I check if a PDF file is using embedded fonts?

I have a folder where multiple clients upload multiple PDF files. Some of them use embedded fonts, some don't.
I've been working on a service that optimizes (in terms of file size) the PDF files in this folder.
Each user may upload around 400 files, each weighing anywhere between 80K and 10M, and my task is to optimize all of them to the smallest possible file size with minimal quality loss.

The PDF Library is doing a great job with it. My only problem is that I can't remove all embedded fonts from all files, since some of the files might rely on these fonts and the result would be a file that I can't use.

So my questions are:

  1. How can I detect which files use embedded fonts and which files don't?
  2. When optimizing the files that use embedded fonts, how can I remove only the unused fonts?

What I want to achieve is to remove all embedded fonts from most of the files, but keep them in the files where I actually need them. I understand that this depends on the fonts installed on my system (these files will stay on a single system, so portability is not that important to me), so I am trying to find a way to identify, before optimizing, which files will look OK without embedded fonts and which files need to keep them.

Upvotes: 2

Views: 1516

Answers (2)

Vel Genov

Reputation: 11091

The Adobe PDF Library, version 15 and up, has a service that will optimize PDF files for you.

The Optimizer has a function to subset all embedded fonts. What that will do is create a subset of each font limited to only the glyphs of that font actually used by the document. The API is below.

void Datalogics::PDFL::PDFOptimizer::SetOption (OptimizerOption option, bool value)
void Datalogics::PDFL::PDFOptimizer::Optimize (Document document, string newPath)

This is the option that you need:

SubsetAllEmbeddedFonts

Upvotes: 0

Patrick Gallot

Reputation: 625

APDFL has a PDFontIsEmbedded() call. The DotNet interface's Font class has an Embedded property. Saving with the GarbageCollect SaveFlag should remove any unreferenced indirect objects, including fonts.
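For illustration only, here is roughly what an "is embedded" check amounts to at the PDF level. In the PDF format, an embedded font program is stored in the font descriptor under one of the keys FontFile, FontFile2 or FontFile3, and composite (Type0) fonts carry the descriptor on their descendant font. The sketch below is not APDFL code: it assumes the font dictionaries have already been read into plain Python dicts by whatever library you use, and the helper name is_embedded is hypothetical. It also treats Type3 fonts as embedded, since their glyphs are defined inside the PDF itself.

```python
# Keys under which a font descriptor stores an embedded font program.
FONT_FILE_KEYS = ("FontFile", "FontFile2", "FontFile3")


def is_embedded(font):
    """Return True if this font dictionary carries its own glyph data.

    `font` is assumed to be a plain dict mirroring a PDF font dictionary
    (hypothetical input shape, not a real library object).
    """
    subtype = font.get("Subtype")
    # Type3 glyphs are content streams stored in the PDF, so nothing
    # needs to be installed on the system to render them.
    if subtype == "Type3":
        return True
    # Composite fonts keep the descriptor on the descendant CIDFont.
    if subtype == "Type0":
        return any(is_embedded(d) for d in font.get("DescendantFonts", []))
    descriptor = font.get("FontDescriptor") or {}
    return any(key in descriptor for key in FONT_FILE_KEYS)


# A file "uses embedded fonts" if any of its fonts is embedded.
fonts = [
    {"Subtype": "Type1", "BaseFont": "Helvetica"},
    {"Subtype": "TrueType", "FontDescriptor": {"FontFile2": b"..."}},
]
has_embedded = any(is_embedded(f) for f in fonts)
```

A font that reports as not embedded will only render correctly if a matching font is installed on the system, so those are the files worth checking against your installed fonts.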

Note that Resource Dictionaries could potentially be shared by multiple pages so that fonts not used by one page might be used by another page that uses the same resource dictionary.
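To make the shared resource dictionary point concrete, here is a small sketch of the bookkeeping involved. It is an assumption-laden illustration, not APDFL code: each page is modelled as a pair of (resources, used_names), where resources maps font resource names to some font identifier and used_names is the set of font names the page's content actually selects. Both shapes and the function name unused_fonts are hypothetical.

```python
def unused_fonts(pages):
    """Return the fonts declared in some resource dictionary but used by no page.

    `pages` is a list of (resources, used_names) pairs:
      resources  - dict mapping a font resource name (e.g. "F1") to a font id
      used_names - set of resource names the page content selects
    """
    declared, used = set(), set()
    for resources, used_names in pages:
        declared.update(resources.values())
        # Resolve each used name through this page's own resources.
        used.update(resources[name] for name in used_names if name in resources)
    # A font is only safe to drop if no page at all refers to it.
    return declared - used


# Two pages share one resource dictionary; each page uses a different font.
shared = {"F1": 10, "F2": 11, "F3": 12}
pages = [(shared, {"F1"}), (shared, {"F2"})]
print(unused_fonts(pages))
```

Looking at the first page alone, F2 would appear unused, which is exactly the trap described above. A real implementation would also have to account for fonts referenced from form XObjects and annotation appearance streams, which this sketch ignores.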

Upvotes: 0
