Reputation: 1
My goal is to convert a PDF into a file that fits the Factur-X format.
I successfully converted a PDF into PDF/A-3b. Here's the code:
import subprocess

gs_path = r"C:\Program Files\gs\gs10.02.1\bin\gswin64.exe"

def convert_to_pdfa(input_path, output_path, pdfa_def_path):
    command = [
        gs_path,
        "-dPDFA=3",
        "-dBATCH",
        "-dNOPAUSE",
        "-sColorConversionStrategy=UseDeviceIndependentColor",
        "-sDEVICE=pdfwrite",
        "-sOutputFile=" + output_path,
        "-dPDFACompatibilityPolicy=2",
        pdfa_def_path,
        input_path
    ]
    subprocess.run(command)

if __name__ == "__main__":
    input_pdf_path = "facture.pdf"
    output_pdfa_path = "output_pdfa.pdf"
    pdfa_def_path = "PDFA_def.ps"
    convert_to_pdfa(input_pdf_path, output_pdfa_path, pdfa_def_path)
Here's the code in the PDFA_def.ps file:
% Define entries in the document Info dictionary :
/ICCProfile (sRGB_v4_ICC_preference.icc)
def
[ /Title (test)
/DOCINFO pdfmark
% Define an ICC profile :
[/_objdef {icc_PDFA} /type /stream /OBJ pdfmark
[{icc_PDFA} <</N systemdict /ProcessColorModel get /DeviceGray eq {1} {4} ifelse >> /PUT pdfmark
[{icc_PDFA} ICCProfile (r) file /PUT pdfmark
% Define the output intent dictionary :
[/_objdef {OutputIntent_PDFA} /type /dict /OBJ pdfmark
[{OutputIntent_PDFA} <<
/Type /OutputIntent % Must be so (the standard requires).
/S /GTS_PDFA1 % Must be so (the standard requires).
/DestOutputProfile {icc_PDFA} % Must be so (see above).
/OutputConditionIdentifier (sRGBv4 ICC preference)
/PUT pdfmark
% Embed XML file:
[ /_objdef {InvoiceStream} /type /stream /OBJ pdfmark
[ {InvoiceStream} << /Type /EmbeddedFile /Subtype (text/xml) cvn /Params << /ModDate (D:20130121081433+01'00') >> >> /PUT pdfmark
[ {InvoiceStream} (output.xml) (r) file /PUT pdfmark
[ {InvoiceStream} /CLOSE pdfmark
[ /_objdef {Invoice_FSDict} /type /dict /OBJ pdfmark
[ {Invoice_FSDict} << /Type /FileSpec /F (output.xml) /UF (output.xml) /Desc (ZUGFeRD XML invoice) /AFRelationship /Alternative /EF << /F {InvoiceStream} /UF {InvoiceStream} >> >> /PUT pdfmark
[ /_objdef {AFArray} /type /array /OBJ pdfmark
[ {AFArray} {FSDict} /APPEND pdfmark
[ {Catalog} << /AF {AFArray} >> /PUT pdfmark
[ /Name (output.xml) /FS {FSDict} /EMBED pdfmark
[
/XML
(
...
)
/Ext_Metadata pdfmark
I followed this tutorial on the ZUGFeRD blog.
When I open the PDF, there's no attached XML file.
I compared the PDF I rendered with a PDF that follows the Factur-X format.
The PDF I rendered:
46 0 obj
<</Type/Metadata
/Subtype/XML/Length 1294>>stream
<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
<?adobe-xap-filters esc="CRLF"?>
<x:xmpmeta xmlns:x='adobe:ns:meta/' x:xmptk='XMP toolkit 2.9.1-13, framework 1.6'>
...
</x:xmpmeta>
<?xpacket end='w'?>
endstream
endobj
A valid PDF:
8 0 obj
<<
/Filter /FlateDecode
/Subtype /XML
/Type /Metadata
/Length 978
>>
stream
... binary data ...
endstream
endobj
Upvotes: 0
Views: 592
Reputation: 1694
For anyone having problems producing correct output: be sure to use the correct version of zugferd.ps. Do not download it from somewhere on the internet; take the one provided with your installed version of Ghostscript.
On Windows it is under %programfiles%\gs\gs<version>\lib
Also take the RGB profile from %programfiles%\gs\gs<version>\iccprofiles
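If you want to pick those files up from a script instead of copying them by hand, something along these lines might work. It is only a sketch that assumes the folder layout described above; find_gs_resources is a made-up helper name, and treating any .icc file with "rgb" in its name as a candidate profile is my own assumption, not a Ghostscript rule.

```python
from pathlib import Path


def find_gs_resources(gs_root):
    """Return (zugferd_ps, rgb_profiles) found under a Ghostscript install folder.

    gs_root is the versioned folder, e.g. the gs<version> directory.
    """
    root = Path(gs_root)
    zugferd_ps = root / "lib" / "zugferd.ps"
    if not zugferd_ps.is_file():
        raise FileNotFoundError(f"zugferd.ps not found under {root / 'lib'}")
    # Assumption: any .icc file with "rgb" in its name is a candidate profile.
    profiles = sorted(
        p for p in (root / "iccprofiles").glob("*.icc") if "rgb" in p.name.lower()
    )
    return zugferd_ps, profiles
```

Call it with the path of your gs<version> folder and pass the returned paths on to the Ghostscript command.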
Upvotes: 0
Reputation: 11739
For Windows users struggling to get the syntax working: use this as a template command, then adapt it slowly until it finally works, at which point you can add -q (if desired).
You need a copy of the files the command below refers to from the installed GS files, plus your invoice-0001 inputs.
The result should be invoice-0001-xml.pdf; if its stated size is 0 bytes, it is not a PDF.
gswin##c --permit-file-read="%CD%/" -sDEVICE=pdfwrite -dPDFA=3 -sColorConversionStrategy=RGB -sZUGFeRDProfile="%CD%\rgb.icc" -sZUGFeRDVersion=2p1 -sZUGFeRDConformanceLevel=BASIC -sZUGFeRDXMLFile="%CD%\invoice-0001.xml" -o"%CD%\invoice-0001-xml.pdf" zugferd.ps "%CD%\invoice-0001.pdf"
NOTES
gswin##c
The correct installed .exe for your system or user environment paths, where ## is either 32 or 64.
"%CD%/"
The current working directory, where it is suggested to keep all the input and output files together while testing (you can replace %CD% after testing). Beware: for the permissions only, it MUST be terminated with a forward slash!
Once you trust a zero Length file it can run anything suitable.
So the file will usually run as a File in Edge.
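A 0-byte result is not a PDF, so when running the command from a script it can help to check the output before going further. A minimal sketch, where looks_like_pdf is a hypothetical helper name; it only looks at the size and the file header and does not validate PDF/A or the attachment:

```python
from pathlib import Path


def looks_like_pdf(path):
    """Return True if the file exists, is non-empty and starts with the PDF header."""
    p = Path(path)
    if not p.is_file() or p.stat().st_size == 0:
        return False
    with p.open("rb") as fh:
        # Every PDF file begins with the bytes "%PDF-" followed by a version.
        return fh.read(5) == b"%PDF-"
```

For example, looks_like_pdf("invoice-0001-xml.pdf") returning False after a run suggests Ghostscript stopped before writing any output.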
Upvotes: 1
Reputation: 3417
I don't see that your subprocess call follows the command-line description in the Ghostscript documentation, shown here:
gs --permit-file-read=/usr/home/me/zugferd/ -sDEVICE=pdfwrite -dPDFA=3 \
   -sColorConversionStrategy=RGB -sZUGFeRDXMLFile=/usr/home/me/zugferd/invoice.xml \
   -sZUGFeRDProfile=/usr/home/me/rgb.icc -sZUGFeRDVersion=2p1 -sZUGFeRDConformanceLevel=BASIC \
   -o /usr/home/me/zugferd/zugferd.pdf \
   /usr/home/me/zugferd/zugferd.ps /usr/home/me/zugferd/original.pdf
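Translated to a Python subprocess call, that command line could look roughly like the sketch below. The switches are the ones from the command above; build_zugferd_command and run_zugferd are hypothetical helper names, and the default values for version and level are simply the ones used in that example.

```python
import subprocess


def build_zugferd_command(gs_exe, work_dir, zugferd_ps, icc_profile,
                          xml_file, input_pdf, output_pdf,
                          version="2p1", level="BASIC"):
    """Build the Ghostscript argument list for a zugferd.ps run."""
    # The permitted read directory is passed with a trailing forward slash.
    permit_dir = work_dir if work_dir.endswith("/") else work_dir + "/"
    return [
        gs_exe,
        f"--permit-file-read={permit_dir}",
        "-sDEVICE=pdfwrite",
        "-dPDFA=3",
        "-sColorConversionStrategy=RGB",
        f"-sZUGFeRDXMLFile={xml_file}",
        f"-sZUGFeRDProfile={icc_profile}",
        f"-sZUGFeRDVersion={version}",
        f"-sZUGFeRDConformanceLevel={level}",
        "-o", output_pdf,
        zugferd_ps,   # the PostScript program comes before the input PDF
        input_pdf,
    ]


def run_zugferd(command):
    """Run Ghostscript; check=True raises CalledProcessError on a non-zero exit."""
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces, such as C:\Program Files.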
There are also Factur-X Python libraries on PyPI.
Upvotes: 0