I have an XFDF file, which is UTF-8 encoded and may contain non-ASCII characters. I would like to merge it with the PDF that contains the form. I tried pdftk, and although the merge itself works (all fields are populated), some characters do not appear in the flattened PDF.
Taking the xfdf:
<?xml version="1.0" encoding="utf-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
<fields>
<field name="some_data">
<value>Űző</value>
</field>
<field name="some_other_data">
<value>ùûüÿ€’“”«»àâæçéèêëïôœÙÛÜŸÀÂÆÇÉÈÊËÏÎÔ</value>
</field>
</fields>
</xfdf>
The resulting PDF's fields have the following values (quotation marks excluded):
So all the characters in some_other_data are stored correctly, but ő and Ű are stored as 00.
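One thing the two fields differ in can be checked quickly. The sketch below tests which characters fall outside Windows-1252; that this code page is what limits the form's Helv font is only my assumption, and the helper name `outside_cp1252` is made up for illustration.

```python
def outside_cp1252(text: str) -> list[str]:
    """Return the characters of `text` that Windows-1252 cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode("cp1252")
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


# The two field values from the XFDF above.
some_data = "Űző"
some_other_data = "ùûüÿ€’“”«»àâæçéèêëïôœÙÛÜŸÀÂÆÇÉÈÊËÏÎÔ"

print(outside_cp1252(some_data))        # ['Ű', 'ő']
print(outside_cp1252(some_other_data))  # []
```

So exactly the two characters that go missing are the ones outside Windows-1252, which may or may not be the actual cause.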
I also noticed that if I uncompress the PDF with pdftk, the original characters are stored in it as:
/DA (/Helv 8.64 Tf 0 g)
/Subtype /Widget
/V (ţ˙ Q z\r )
/T (some_data)
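As far as I understand, a PDF text string that starts with the bytes FE FF is UTF-16BE, which would explain why /V looks garbled in a text editor. Here is a minimal sketch of decoding such a value; the bytes are built from the XFDF value rather than copied from my file, the function name `decode_pdf_text_string` is hypothetical, and escape sequences inside literal strings (such as `\r` or octal codes) are not handled.

```python
import codecs


def decode_pdf_text_string(raw: bytes) -> str:
    """Decode an already-unescaped PDF text string.

    Strings starting with the UTF-16BE byte-order mark (FE FF) are decoded
    as UTF-16BE. Everything else is treated as Latin-1 here, which is a
    simplification: PDFDocEncoding differs from Latin-1 for some bytes.
    """
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    return raw.decode("latin-1")


# Illustrative bytes: what "Űző" looks like as a UTF-16BE text string.
raw = codecs.BOM_UTF16_BE + "Űző".encode("utf-16-be")

print(raw.hex(" "))                 # fe ff 01 70 00 7a 01 51
print(decode_pdf_text_string(raw))  # Űző
```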
That the correct characters are there is also clear when I open the unflattened form in Adobe Reader. Initially the form field some_data shows only the letter z surrounded by spaces, but if I click on the field, the special characters are revealed, and any change made to the field value makes the correct characters stay visible. On the other hand, if I leave the field without modifying it, they disappear again.
I also tried to use numeric entities in the xfdf, but it did not help either.
I have 2 questions:
Thank you all!