Reputation: 11
I thought my problem would be easy: I wanted to create a PDF document from a template. There are 12 values that I want to fill in. I want to be able to change the template later on (design), so I thought a PDF form would be a good idea. I used the existing Word template and changed it into a PDF form with Adobe Acrobat Pro. My backend application relies entirely on PHP, but I'm open to suggestions.
I've tried FPDI, FPDM, and PDFTK.
Everything is working fine with PDFTK, but I cannot get it to fill in special characters like € ä ü ß ö:
private function createCertificate($certificate, $template): array
{
    $pathOutput = $this->basePathOutput.'certificate.pdf'; //DEBUGGING
    $pathFdf = $this->basePathFdf.'certificate.fdf'; //DEBUGGING
    $fdf = $this->createFDF($certificate);
    file_put_contents($pathFdf, $fdf);
    $result = [
        "exitstatus" => 0,
        "pathOutput" => $pathOutput,
    ];
    $command = "pdftk $template fill_form $pathFdf output $pathOutput need_appearances flatten";
    //echo $command;
    // Collect pdftk's console output in its own variable instead of overwriting $pathOutput.
    exec($command, $output, $exitStatus);
    $result["exitstatus"] = $exitStatus;
    return $result;
}
private function createFDF($certificate): string
{
    $fdf = "%FDF-1.2\r\n";
    $fdf .= "1 0 obj << /FDF << /Fields[\r\n";
    // $certificate is an array that has the field names for keys and the field values for values.
    foreach ($certificate as $key => $value) {
        if ($value == "") {
            continue;
        }
        $encodedValue = iconv('UTF-8', 'UTF-8', $value);
        $fdf .= "<< /V (".$encodedValue.") /T (".$key.") >>\r\n";
    }
    $fdf .= "] >> >>\r\n";
    $fdf .= "endobj\r\n";
    $fdf .= "trailer\r\n";
    $fdf .= "<</Root 1 0 R >>\r\n";
    $fdf .= "%%EOF\r\n";
    return $fdf;
}
I've tried encoding it into UTF-16LE. Then the characters ö, ä, and ü are displayed, but € isn't. Also, there are weird spaces between all the characters.
I suppose there must be an easy solution to this. I'm also open to using another technology as I know pdftk isn't the stuff professionals would use :)
Edit:
I've changed $encodedValue = iconv('UTF-8', 'UTF-8', $value);
to $encodedValue = utf8_decode($value);
Now it works fine with ü, but € is displayed as ?.
So that's not yet satisfying.
Upvotes: 1
Views: 96
Reputation: 32252
> I've changed
> $encodedValue = iconv('UTF-8', 'UTF-8', $value);
Converting from one charset to the same charset is a no-op. It does nothing.
> to
> $encodedValue = utf8_decode($value);
> Now it works fine with ü, but € is displayed as ?.
> So that's not yet satisfying
This points at the correct answer, because utf8_decode() only converts from UTF-8 to ISO-8859-1, and € is not present in 8859-1. utf8_decode() and utf8_encode() have actually been deprecated (as of PHP 8.2) due to how universally misunderstood their purpose is, and the fact that they are almost never used correctly.
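To make the 8859-1 limitation concrete, here is a minimal sketch that checks whether a string fits into ISO-8859-1 at all. It relies on the fact that ISO-8859-1 covers exactly the code points U+0000 through U+00FF. The helper name fitsInLatin1() is made up for this illustration and is not part of the question's code.

```php
<?php
// Hypothetical helper (not from the question): returns true when every
// character of a UTF-8 string lies in U+0000..U+00FF, the range that
// ISO-8859-1 can represent.
function fitsInLatin1(string $utf8): bool
{
    // The /u modifier makes PCRE treat the pattern and the subject as UTF-8.
    return preg_match('/\A[\x{00}-\x{FF}]*\z/u', $utf8) === 1;
}

// Escapes are used so the example does not depend on the file's encoding.
$umlaut = "\u{00FC}"; // ü
$euro   = "\u{20AC}"; // €

var_dump(fitsInLatin1($umlaut)); // bool(true)
var_dump(fitsInLatin1($euro));   // bool(false)
```

Since € sits at U+20AC, far outside that range, any conversion to ISO-8859-1 has to drop or replace it.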
All that aside, € is present in MS's 8859-1 lookalike/superset encoding cp1252, and is actually one of the few ways to tell them apart. This should solve your issue:
$encodedValue = iconv('UTF-8', 'cp1252', $value);
Here, your original data appears to be UTF-8, and the document you're generating appears to use cp1252.
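As a quick way to verify the conversion, the sketch below wraps the iconv() call and prints the resulting bytes as hex. The wrapper name toCp1252() and the exception handling are assumptions added for illustration; only the iconv('UTF-8', 'cp1252', ...) call itself comes from the suggestion above.

```php
<?php
// Hypothetical wrapper around iconv(); the name toCp1252() is an assumption.
function toCp1252(string $utf8): string
{
    // Depending on the iconv implementation, an unrepresentable character
    // makes iconv() return false and raise a notice. The notice is
    // suppressed here and the failure is surfaced as an exception instead.
    $converted = @iconv('UTF-8', 'CP1252', $utf8);
    if ($converted === false) {
        throw new InvalidArgumentException('Value cannot be represented in cp1252');
    }
    return $converted;
}

// "€ ä ü ß ö", written with escapes so the example does not depend on
// the encoding of this source file.
$value = "\u{20AC} \u{00E4} \u{00FC} \u{00DF} \u{00F6}";

echo bin2hex(toCp1252($value)), "\n"; // 8020e420fc20df20f6
```

The leading 80 is the single cp1252 byte for €; the umlauts and ß keep the same byte values they have in ISO-8859-1.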
String encodings are metadata that needs to be tracked alongside the actual data. They cannot be detected reliably, and functions that purport to do so are guessing. Sometimes humans can figure it out based on trial/error/gut feeling, or in this case a single conspicuous character, but under certain conditions a lot of encodings look identical at a glance.
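A small sketch of why detection is guesswork: the same bytes are valid under several encodings, and only an occasional byte such as 0x80 gives the game away. The byte values below are chosen purely for illustration.

```php
<?php
// The byte 0xFC decodes to the same character (ü) in both encodings,
// so it says nothing about which of the two was used.
$sameInBoth = iconv('ISO-8859-1', 'UTF-8', "\xFC") === iconv('CP1252', 'UTF-8', "\xFC");
var_dump($sameInBoth); // bool(true)

// The byte 0x80 is the conspicuous one: € in cp1252, but the control
// character U+0080 in ISO-8859-1.
var_dump(iconv('CP1252', 'UTF-8', "\x80") === "\u{20AC}");     // bool(true)
var_dump(iconv('ISO-8859-1', 'UTF-8', "\x80") === "\u{0080}"); // bool(true)

// The UTF-8 bytes for ü (C3 BC) are also valid cp1252, just different text.
$utf8Bytes = "\u{00FC}";
$misread = iconv('CP1252', 'UTF-8', $utf8Bytes);
var_dump($misread === "\u{00C3}\u{00BC}"); // bool(true): the text reads "Ã¼"
```

None of these calls can tell you which interpretation was intended; that information has to travel with the data.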
Upvotes: 0