Anders

Reputation: 163

How to remove tags from a PDF

I have a PDF that contains tags like these:

62 0 obj
<< /Type /StructElem /S /DokumentNavn /P 56 0 R /K 2 /Pg 58 0 R >>
endobj
60 0 obj
<< /Type /StructElem /S /Bundtekst /P 56 0 R /K 0 /Pg 58 0 R >>
endobj
61 0 obj
<< /Type /StructElem /S /ReferenceLinjer /P 56 0 R /Lang (da) /K 1 /Pg 58 0 R >>
endobj
68 0 obj
<< /Type /StructElem /S /Fritekst /P 56 0 R /K 6 /Pg 58 0 R >>
endobj

I have "removed" them by overwriting them with %. However, the tool that checks the tags against a whitelist still complains, so I suspect the tags are also used in the binary sections of the PDF. Can ABCpdf remove tags, or is there another solution?

Upvotes: 0

Views: 753

Answers (1)

Bobrovsky

Reputation: 14246

The Docotic.Pdf library can remove structure information from PDF documents.

Below is sample code for the task:

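using BitMiracle.Docotic.Pdf;

public static void SaveWithoutStructureInformation(string input, string output)
{
    using (PdfDocument document = new PdfDocument(input))
    {
        // Remove the structure information (the tags) from the document.
        document.RemoveStructureInformation();

        // Do not write objects that are no longer referenced after the removal.
        document.SaveOptions.RemoveUnusedObjects = true;
        document.Save(output);
    }
}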

Disclaimer: I work for the vendor of the library.
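
As for the suspicion that the tags also live in the binary sections: structure elements can be stored inside Flate-compressed streams (object streams, for example), where overwriting the visible text with % would not reach them. Below is a rough sketch for checking this using only the .NET base class library (.NET 6 or later, for ZLibStream). StructTagScanner and FindTags are hypothetical names, not part of Docotic.Pdf or ABCpdf. The sketch assumes the file is unencrypted, that streams use a plain FlateDecode filter without predictors, and that each structure element is a flat dictionary with an explicit /Type /StructElem entry, so it can miss tags in files that do not meet those assumptions. It only reports tag names; it does not remove anything.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of any PDF library.
public static class StructTagScanner
{
    // A flat dictionary that declares /Type /StructElem (nested dictionaries are not handled).
    private static readonly Regex StructElem =
        new Regex(@"<<[^<>]*/Type\s*/StructElem[^<>]*>>");

    // The name that follows the /S key, i.e. the tag.
    private static readonly Regex TagName =
        new Regex(@"/S\s*/([^\s/<>\[\]()]+)");

    // The bytes between the "stream" and "endstream" keywords.
    private static readonly Regex StreamBody =
        new Regex(@"stream\r?\n(.*?)endstream", RegexOptions.Singleline);

    // Returns the inflated text, or an empty string if the data is not zlib/Flate.
    private static string TryInflate(byte[] data)
    {
        try
        {
            using var input = new MemoryStream(data);
            using var zlib = new ZLibStream(input, CompressionMode.Decompress);
            using var output = new MemoryStream();
            zlib.CopyTo(output);
            return Encoding.Latin1.GetString(output.ToArray());
        }
        catch (InvalidDataException)
        {
            return ""; // image, font, or a stream with another filter
        }
    }

    // Collects tag names from the raw file text and from every stream that inflates.
    public static SortedSet<string> FindTags(byte[] pdf)
    {
        // Latin-1 maps each byte to exactly one char, so binary data survives the round trip.
        string raw = Encoding.Latin1.GetString(pdf);
        var texts = new List<string> { raw };
        foreach (Match m in StreamBody.Matches(raw))
            texts.Add(TryInflate(Encoding.Latin1.GetBytes(m.Groups[1].Value)));

        var tags = new SortedSet<string>(StringComparer.Ordinal);
        foreach (string text in texts)
        {
            foreach (Match dict in StructElem.Matches(text))
            {
                Match name = TagName.Match(dict.Value);
                if (name.Success) tags.Add(name.Groups[1].Value);
            }
        }
        return tags;
    }

    public static void Main()
    {
        // Fake fragment for the demo: one tag in plain text, one inside a Flate stream.
        byte[] hidden = Encoding.Latin1.GetBytes("<< /Type /StructElem /S /Fritekst /K 6 >>");
        using var packed = new MemoryStream();
        using (var z = new ZLibStream(packed, CompressionLevel.Optimal, leaveOpen: true))
            z.Write(hidden, 0, hidden.Length);

        using var pdf = new MemoryStream();
        void Put(string s) => pdf.Write(Encoding.Latin1.GetBytes(s));
        Put("60 0 obj\n<< /Type /StructElem /S /Bundtekst /K 0 >>\nendobj\n");
        Put("70 0 obj\n<< /Filter /FlateDecode >>\nstream\n");
        pdf.Write(packed.ToArray());
        Put("\nendstream\nendobj\n");

        // Prints every tag name found, including the one hidden in the compressed stream.
        Console.WriteLine(string.Join(", ", FindTags(pdf.ToArray())));
    }
}
```

To run it against a real file, pass File.ReadAllBytes(path) to FindTags. If it reports tag names that were already overwritten in the visible text, those names are most likely stored in a compressed stream. Custom tag names may also be listed in the document's /RoleMap dictionary, which this sketch does not inspect.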

Upvotes: 1
