Tom

Reputation: 720

Remove Layers/Background from PDF in PHP/Bash/C#

I have some PDF files that I need to modify using a PHP script. I can also call exec(), so I can use pretty much anything that runs on CentOS.

When opened in Adobe Acrobat Pro X, the PDF files show 2 layers in the "Layers" panel:

  1. Background
  2. Color

When I disable both of these layers, I end up with black-and-white text and images (the text is not vector, though; it's a scanned document).

I want to disable these layers and any other similar layer found in the PDFs using PHP and/or C# or any command-line tool.

Other useful information:

When I run pdfimages (provided with XPDF) on my PDFs, it extracts exactly what I actually need removed from each page...

Update: I modified the PDFSharp example at http://www.pdfsharp.net/wiki/ExportImages-sample.ashx as follows:

Modified:
Line 28: ExportImage(xObject, ref imageCount);

To:
PdfObject obj = xObject.Elements.GetObject("/OC");
Console.WriteLine(obj);

I got the following output in the console for each image:
<< /Name Background /Type /OCG >>
<< /OCGs [ 2234 0 R ] /P /AllOff /Type /OCMD >>
<< /Name Text Color /Type /OCG >>

This is the layer information. The PDFSharp documentation describes the /OC key as follows:

Before the image is processed, its visibility is determined based on this entry. If it is determined to be invisible, the entire image is skipped, as if there were no Do operator to invoke it.

So now, how do I modify the /OC value to something that will make these layers invisible?

Upvotes: 5

Views: 6788

Answers (1)

Tom

Reputation: 720

After long hours of experimenting, I found a way to do it with iTextSharp. I'm posting the code in case someone finds it helpful in the future:

using System.Collections.Generic;
using System.IO;
using iTextSharp.text.pdf;

namespace LayerHide
{
    class MainClass
    {
        public static void Main(string[] args)
        {
            // Open the source PDF; the stamper writes the modified copy to test2.pdf.
            PdfReader reader = new PdfReader("test.pdf");
            PdfStamper stamp = new PdfStamper(reader, new FileStream("test2.pdf", FileMode.Create));

            // Fetch the layers defined in the document.
            Dictionary<string, PdfLayer> layers = stamp.GetPdfLayers();

            // Switch every layer off so its content is no longer shown.
            foreach (KeyValuePair<string, PdfLayer> entry in layers)
            {
                entry.Value.On = false;
            }

            // Closing the stamper finishes writing the output file.
            stamp.Close();
            reader.Close();
        }
    }
}
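
If only some layers should be hidden (say "Background" but not others), the same loop can be filtered by name. The following is a minimal sketch, not tested against the PDFs from the question. It assumes the keys of the dictionary returned by GetPdfLayers() are the layer names as shown in Acrobat's Layers panel, which is how iTextSharp 5.x appears to behave (duplicate names may get a numeric suffix). The class name LayerHider, the method HideLayers, and the command-line argument layout are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using iTextSharp.text.pdf;

// Hypothetical helper: hides only the layers whose names are listed.
static class LayerHider
{
    // Returns the number of layers that were switched off.
    public static int HideLayers(string inputPath, string outputPath,
                                 ICollection<string> namesToHide)
    {
        int hidden = 0;
        PdfReader reader = new PdfReader(inputPath);
        try
        {
            PdfStamper stamper = new PdfStamper(
                reader, new FileStream(outputPath, FileMode.Create));

            // Assumption: entry.Key is the layer name.
            foreach (KeyValuePair<string, PdfLayer> entry in stamper.GetPdfLayers())
            {
                Console.WriteLine("Found layer: " + entry.Key);
                if (namesToHide.Contains(entry.Key))
                {
                    entry.Value.On = false;
                    hidden++;
                }
            }

            // Closing the stamper finishes writing the output file.
            stamper.Close();
        }
        finally
        {
            reader.Close();
        }
        return hidden;
    }

    // Usage: LayerHider.exe input.pdf output.pdf LayerName [LayerName ...]
    static void Main(string[] args)
    {
        if (args.Length < 3)
        {
            Console.Error.WriteLine("usage: input.pdf output.pdf LayerName [LayerName ...]");
            return;
        }

        HashSet<string> names = new HashSet<string>();
        for (int i = 2; i < args.Length; i++)
        {
            names.Add(args[i]);
        }

        int count = HideLayers(args[0], args[1], names);
        Console.WriteLine("Layers hidden: " + count);
    }
}
```

Because it prints every layer name it finds, running it once with a name that matches nothing is a quick way to see what the layers in a given file are actually called before deciding which ones to hide.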

Upvotes: 11
