pradeep

Reputation: 441

Handling Images in XFDF for PDF Annotations

I'm working on a project where we need to manage PDF annotations using XFDF in an iOS mobile app. Currently, the images included in the annotations are encoded as base64 strings, which significantly increases the size of the XFDF data. This affects not only local storage in Core Data but also data transfer when fetching from the server.

I'm looking for alternatives to manage these images more efficiently. Specifically, is it possible to use URLs for images instead of base64 encoding? That way I could download the images to the local file system and map them to their corresponding annotations.

Also, are there other strategies for handling image data in XFDF that could help reduce storage and transfer overhead? Any advice or examples would be greatly appreciated!

Note: I am looking to use Apryse or PSPDFKit for PDF viewing and markup rendering, where the XFDF string is given to the SDK to render the markups on the PDF.

<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
  <pdf-info xmlns="http://www.pdftron.com/pdfinfo" version="2" import-version="4"></pdf-info>
  <fields></fields>
  <annots>
    <stamp page="0" rect="321.521,562.931,508.2,611.931" flags="print" name="c6029306-80ff-32aa-52e9-827992c5e269" title="Guest" subject="Approved" date="D:20241104112145+05'30'" creationdate="D:20241104112142+05'30'" icon="Approved">
      <trn-custom-data bytes="{&quot;trn-annot-maintain-aspect-ratio&quot;:&quot;true&quot;,&quot;trn-associated-number&quot;:&quot;1&quot;,&quot;trn-unrotated-rect&quot;:&quot;321.521,562.931,508.2,611.931&quot;}"></trn-custom-data>
      <imagedata>data:image/png;base64,/rQ7TFbu7Zmc...</imagedata>
    </stamp>
  </annots>
  <pages>
    <defmtx matrix="1,0,0,-1,0,792"></defmtx>
  </pages>
</xfdf>

Upvotes: 1

Views: 128

Answers (2)

K J

Reputation: 11722

Adobe PDF does allow images to live outside the PDF as URLs on a remote server. However, that is strongly to be avoided unless you are using secured in-house Acrobat readers. See my answer about "Image Stream" here: https://superuser.com/a/1809997/1769247

FDF and XFDF are in effect "redline files" that carry floating data for form fields or PDF commentary. They are not physically tied to one source file, but they are meaningless without that context. Because of transport encoding overheads, the files are larger than the content they carry and may be many times larger than a source file. Note that whilst you can use native JPEG data in a PDF, you cannot use a PNG as-is: it must be transformed into an RGB image, plus a second PDF-encoded image mask when it has an alpha channel.
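
To illustrate the PNG point, here is a rough Python sketch of the split a PDF writer has to perform. It assumes the PNG has already been decoded to 8-bit interleaved RGBA samples, and the name `split_rgba` is made up for this example. In a PDF, the colour bytes would typically end up in a DeviceRGB image XObject and the alpha bytes in a DeviceGray image referenced as its /SMask.

```python
def split_rgba(pixels: bytes) -> tuple[bytes, bytes]:
    """Split interleaved 8-bit RGBA samples into (rgb_bytes, alpha_bytes)."""
    if len(pixels) % 4 != 0:
        raise ValueError("expected a whole number of 4-byte RGBA pixels")
    # Every fourth byte, starting at index 3, is an alpha sample.
    alpha = pixels[3::4]
    # Everything else is colour data, kept in R, G, B order.
    rgb = bytes(b for i, b in enumerate(pixels) if i % 4 != 3)
    return rgb, alpha


# Two pixels: half-transparent red and opaque green.
rgb, alpha = split_rgba(bytes([255, 0, 0, 128, 0, 255, 0, 255]))
```

Both planes would then be compressed separately, which is why a PNG never passes through a PDF untouched the way a JPEG can.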

This should be equivalent to your minimal working example.

<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
    <f href="any.pdf"/>
    <annots>
        <stamp icon="Approved" title="You" IT="Stamp" page="0" date="D:20241105" flags="print" name="535f5045-9d95-4000-8c9e94f33be3ad07" rect="321.521,562.931,508.2,611.931" color="#FF00FF">
            <appearance>77u/PERJQ1QgS0VZPSJBUCI+Cgk8U1RSRUFNIEtFWT0iTiI+CgkJPEFSUkFZIEtFWT0iQkJveCI+
CgkJCTxJTlQgVkFMPSIwIi8+CgkJCTxJTlQgVkFMPSIwIi8+CgkJCTxJTlQgVkFMPSIxOTAiLz4K
CQkJPElOVCBWQUw9IjUwIi8+CgkJPC9BUlJBWT4KCQk8TkFNRSBLRVk9IkZpbHRlciIgVkFMPSJG
bGF0ZURlY29kZSIvPgoJCTxJTlQgS0VZPSJMZW5ndGgiIFZBTD0iMTE4Ii8+CgkJPEFSUkFZIEtF
WT0iTWF0cml4Ij4KCQkJPElOVCBWQUw9IjEiLz4KCQkJPElOVCBWQUw9IjAiLz4KCQkJPElOVCBW
QUw9IjAiLz4KCQkJPElOVCBWQUw9IjEiLz4KCQkJPElOVCBWQUw9IjAiLz4KCQkJPElOVCBWQUw9
IjAiLz4KCQk8L0FSUkFZPgoJCTxESUNUIEtFWT0iUmVzb3VyY2VzIj4KCQkJPERJQ1QgS0VZPSJG
b250Ij4KCQkJCTxESUNUIEtFWT0iVGltZXMiPgoJCQkJCTxOQU1FIEtFWT0iQmFzZUZvbnQiIFZB
TD0iVGltZXMtQm9sZCIvPgoJCQkJCTxOQU1FIEtFWT0iRW5jb2RpbmciIFZBTD0iV2luQW5zaUVu
Y29kaW5nIi8+CgkJCQkJPE5BTUUgS0VZPSJTdWJ0eXBlIiBWQUw9IlR5cGUxIi8+CgkJCQkJPE5B
TUUgS0VZPSJUeXBlIiBWQUw9IkZvbnQiLz4KCQkJCTwvRElDVD4KCQkJPC9ESUNUPgoJCTwvRElD
VD4KCQk8TkFNRSBLRVk9IlN1YnR5cGUiIFZBTD0iRm9ybSIvPgoJCTxOQU1FIEtFWT0iVHlwZSIg
VkFMPSJYT2JqZWN0Ii8+CgkJPERBVEEgTU9ERT0iUkFXIiBFTkNPRElORz0iSEVYIj43ODlDNEQ4
QkIxMEVDMjMwMTA0Mzc3N0Y4NTQ3MTgwODc3Q0RCNTY5NDYxMDE1MjM1NTM5RjEwMzUwMTA0ODVE
Q0FDMEVGMTM1OEMwOTZBNQo2N0M5NTY0QUYxN0M4MzdFNjFEODIzRTQyMkFCMzUzMzg4OEEyNTRE
QUQ2NUFFRkVDQjZGRjJGOTlDMjc1NDdDOTU1NEQ0QjZBMTE5RTcxMQo0NzZDMUQ2QkJGNEZFMzkz
NTFFODU3QTg4NjI2RDUyMjkxMUFFOTE3MkMzNjdEM0YxQzRFREQ2RTQ5N0ZBMDczQkMwMUFGQzgx
RDY0PC9EQVRBPgoJPC9TVFJFQU0+CjwvRElDVD4K</appearance>
        </stamp>
    </annots>
</xfdf>

Save the XML text above as any.xfdf, then open it with Adobe Acrobat Reader.

Ensure you have all permissions in place to import to any target file.

If everything is correct, the Reader will stamp the page (pages are numbered from 0) with the image APPEARANCE, which is not a conventional "base64" data URL.

A minimal test image of under 100 bytes becomes roughly 1,500 bytes as textual FDF and over 3,300 bytes as XFDF text, because the stream is written as hex with two characters per byte. Thus the bigger the input, the more that size increase costs in storage and transfer.

The XFDF base64 image data decodes to a dictionary with an embedded hex-encoded data stream, so it is exceptionally wasteful: bloat nested in bloat nested in bloat!

<DICT KEY="AP">
    <STREAM KEY="N">
        <ARRAY KEY="BBox">
            <INT VAL="0"/>
            <INT VAL="0"/>
            <INT VAL="190"/>
            <INT VAL="50"/>
        </ARRAY>
        <NAME KEY="Filter" VAL="FlateDecode"/>
        <INT KEY="Length" VAL="118"/>
        <ARRAY KEY="Matrix">
            <INT VAL="1"/>
            <INT VAL="0"/>
            <INT VAL="0"/>
            <INT VAL="1"/>
            <INT VAL="0"/>
            <INT VAL="0"/>
        </ARRAY>
        <DICT KEY="Resources">
            <DICT KEY="Font">
                <DICT KEY="Times">
                    <NAME KEY="BaseFont" VAL="Times-Bold"/>
                    <NAME KEY="Encoding" VAL="WinAnsiEncoding"/>
                    <NAME KEY="Subtype" VAL="Type1"/>
                    <NAME KEY="Type" VAL="Font"/>
                </DICT>
            </DICT>
        </DICT>
        <NAME KEY="Subtype" VAL="Form"/>
        <NAME KEY="Type" VAL="XObject"/>
        <DATA MODE="RAW" ENCODING="HEX">789C4D8BB10EC2301043777F8547180877CDB569461015235539F1035010485DCAC0EF1358C096A5
67C9564AF17C837E61D823E422AB3533888A254DAD65AEFECB6FF2F99C27547C9554D4B6A119E711
476C1D6BBF4FE39351E857A88626D522911AE9172C367D3F1C4EDD6E497FA073BC01AFC81D64</DATA>
    </STREAM>
</DICT>
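
To put rough numbers on that nesting, here is a small Python sketch that applies the same layers in order: deflate, then hex, then base64 of the hex text. The sample payload and the name `encoding_sizes` are made up for illustration, and the sketch ignores the surrounding dictionary markup, which adds further overhead.

```python
import base64
import zlib


def encoding_sizes(payload: bytes) -> dict[str, int]:
    """Return the byte count after each encoding layer."""
    deflated = zlib.compress(payload)                  # FlateDecode stream
    hex_text = deflated.hex().upper().encode("ascii")  # ENCODING="HEX": 2 chars per byte
    b64_text = base64.b64encode(hex_text)              # base64 of the hex: 4 chars per 3 bytes
    return {
        "raw": len(payload),
        "deflated": len(deflated),
        "hex": len(hex_text),
        "base64_of_hex": len(b64_text),
    }


sizes = encoding_sizes(b"Approved " * 20)
print(sizes)
```

Hex doubles the stream and base64 then adds another third, so the 118-byte stream above becomes 236 hex characters and about 316 base64 characters before any line breaks or markup are counted.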

All in all, that could be much smaller as binary FDF or a similar PDF object, such as:

27 0 obj
<</Type/XObject/Subtype/Form/BBox[0 0 190 50]/Matrix[1 0 0 1 0 0]/Resources<</Font<</Times 26 0 R>>>>/Length 118/Filter/FlateDecode>>
stream
xœM‹±Â0Cw…Gw͵iF#U9ñPH]ÊÀïXÀ–¥gÉVJñ|ƒ~aØ#ä"«53ˆŠ%M­e®þËoòùœ'T|•TÔ¶¡çGlk¿Oã“QèW¨†&Õ"‘é,6}?NÝnI s¼¯Èd
endstream
endobj

Upvotes: -1

KtoroApryse

Reputation: 21

You could use URLs through custom data: store the URL there, fetch the image yourself, and then set it as the stamp's image. However, this would introduce a few complications, such as handling an unreachable URL and actually hosting the image somewhere if it's a custom image. This would only work with our viewers, since it would be in custom data. To clarify, XFDF is part of the PDF ISO standard, and it doesn't define support for external URLs.
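
A viewer-independent variant of the same idea is to strip the payloads out before storing or transferring the XFDF and to put them back before handing the string to the SDK. The Python sketch below is only an illustration under several assumptions: the `ref:sha256:` placeholder is a made-up convention that no viewer understands, the `store` dict stands in for files on disk or a server, and the regex only handles plain `<imagedata>` elements like the one in the question.

```python
import hashlib
import re

IMAGEDATA = re.compile(r"<imagedata>(.*?)</imagedata>", re.DOTALL)
REF_PREFIX = "ref:sha256:"  # hypothetical placeholder scheme, not part of XFDF


def externalize_images(xfdf: str, store: dict[str, str]) -> str:
    """Move inline <imagedata> payloads into `store`, leaving short references."""
    def swap(match: re.Match) -> str:
        payload = match.group(1)
        if payload.startswith(REF_PREFIX):
            return match.group(0)  # already externalized
        key = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        store[key] = payload  # identical images share one entry
        return f"<imagedata>{REF_PREFIX}{key}</imagedata>"
    return IMAGEDATA.sub(swap, xfdf)


def inline_images(xfdf: str, store: dict[str, str]) -> str:
    """Restore the payloads so the result is ordinary XFDF again."""
    def swap(match: re.Match) -> str:
        body = match.group(1)
        if not body.startswith(REF_PREFIX):
            return match.group(0)  # still inline, leave untouched
        return f"<imagedata>{store[body[len(REF_PREFIX):]]}</imagedata>"
    return IMAGEDATA.sub(swap, xfdf)
```

The slimmed string is what would go into Core Data or over the wire, and `inline_images` must run before the XFDF reaches the viewer. Because entries are keyed by content hash, a stamp reused across many annotations is stored and downloaded only once.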

Upvotes: 1
