Reputation: 152
I am trying to convert encrypted documents (doc/docx) to PDF using Python.
What I do is:
unoconv -f pdf -eSelectPdfVersion=1 [path-to-file]
The conversion runs, but the doc and docx files change in appearance (both the decrypted file and the PDF). The original encrypted file is not affected: when I decrypt it from a Windows client, the decrypted file looks exactly as it originally did.
The change is in the document styling, which affects the page count. For example, a 13-page Word document is decrypted into a 14-page Word document and converted to a 14-page PDF. Similarly, a 348-page doc file becomes a 330-page doc file and then a 330-page PDF.
I discovered that there is a slight style incompatibility between Microsoft Word and the version of LibreOffice installed with unoconv (4.3). In my tests I noticed that fonts get replaced with LibreOffice-compatible ones that differ slightly in size from the originals.
I installed later versions of LibreOffice (5.1, 5.3), and in my tests the decrypted doc/docx file had the proper formatting and page count. However, unoconv does not use the newer version and sticks to 4.3, so the PDF still has incorrect styling and the wrong page count.
I also tried:
soffice --headless --convert-to pdf [path-to-file] --outdir [path-to-export-directory]
But it does nothing.
Is there a way to use unoconv with a LibreOffice version other than 4.3?
Is there a way to make the --convert-to command work with LibreOffice 5.1 or even 5.3?
Upvotes: 3
Views: 5440
Reputation: 305
Here are a few steps you could try. First, uninstall the older version of LibreOffice using
sudo apt remove 'libreoffice*'
Install the latest version of LibreOffice using
sudo add-apt-repository ppa:libreoffice/ppa
sudo apt-get update
sudo apt-get install libreoffice
To check that LibreOffice was installed successfully, run
libreoffice --version
This should print the version number.
Next, install the Microsoft fonts using
sudo apt install ttf-mscorefonts-installer
Also install any other fonts that you expect your documents to use; a missing font gets substituted, which changes the layout.
Finally, use the command below to convert to PDF. Make sure no LibreOffice application is running in the background.
libreoffice --headless --invisible --convert-to pdf "test.docx" --outdir files
You should find the PDF in the folder called files.
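Since the question mentions Python, here is a minimal sketch of how the same conversion could be driven from a script. It assumes the binary is on PATH as libreoffice, and it passes -env:UserInstallation so the conversion runs with its own throwaway profile, which in my experience avoids the case where the command silently does nothing because another instance holds the default profile. The function names (build_convert_command, convert_to_pdf) are just illustrative.

```python
import subprocess
import tempfile
from pathlib import Path


def build_convert_command(input_path, outdir, profile_dir, binary="libreoffice"):
    """Return the argument list for a headless docx/doc -> PDF conversion."""
    # A private profile directory keeps this run independent of any
    # LibreOffice instance that is already using the default profile.
    profile_uri = Path(profile_dir).resolve().as_uri()
    return [
        binary,
        f"-env:UserInstallation={profile_uri}",
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(outdir),
        str(input_path),
    ]


def convert_to_pdf(input_path, outdir, binary="libreoffice", timeout=300):
    """Convert one document and return the path of the resulting PDF."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as profile_dir:
        cmd = build_convert_command(input_path, outdir, profile_dir, binary)
        subprocess.run(cmd, check=True, timeout=timeout)
    # The exit code alone is not treated as proof of success here,
    # so also check that the output file actually exists.
    pdf_path = outdir / (Path(input_path).stem + ".pdf")
    if not pdf_path.exists():
        raise RuntimeError(f"no PDF was produced for {input_path}")
    return pdf_path
```

Calling convert_to_pdf("test.docx", "files") should then be equivalent to the command above. If you keep several LibreOffice versions installed side by side, pass the full path of the one you want as binary.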
This works on Ubuntu 18.04.5 LTS.
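If you would rather keep unoconv, its documentation describes a UNO_PATH environment variable that tells it which LibreOffice installation to use. The sketch below sets it from Python; the directory /opt/libreoffice5.3/program is an assumption about where the newer version is installed, so adjust it to your system. I have not verified this against 4.3 and 5.3 installed side by side, and unoconv may also need to be run with a Python interpreter that can import that installation's uno module.

```python
import os
import subprocess


def unoconv_env(program_dir, base_env=None):
    """Return a copy of the environment with UNO_PATH pointing at program_dir."""
    env = dict(os.environ if base_env is None else base_env)
    env["UNO_PATH"] = program_dir
    return env


def convert_with_unoconv(input_path, program_dir="/opt/libreoffice5.3/program"):
    """Run unoconv against the LibreOffice found in program_dir (assumed path)."""
    cmd = ["unoconv", "-f", "pdf", "-e", "SelectPdfVersion=1", str(input_path)]
    subprocess.run(cmd, check=True, env=unoconv_env(program_dir))
```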
Upvotes: 1