Reputation: 1297
I have LibreOffice 6.4 installed in my Ubuntu 18.04 container.
The goal is to convert a PDF file to DOCX.
I have already tried these commands :
libreoffice --headless --convert-to docx:"Microsoft Word 2007/2010/2013 XML" /pdf/pdf.pdf --outdir /pdf
libreoffice --headless --convert-to docx:"Microsoft Word 2007-2013 XML" /pdf/pdf.pdf --outdir /pdf
libreoffice --headless --convert-to docx:"MS Word 2007 XML" /pdf/pdf.pdf --outdir /pdf
libreoffice --headless --convert-to docx:writer_MS_Word_97 /pdf/pdf.pdf --outdir /pdf
libreoffice --headless --convert-to "docx:writer_MS_Word_2007" /pdf/pdf.pdf --outdir /pdf
libreoffice --headless --convert-to docx:writer_OOXML /pdf/pdf.pdf --outdir /pdf
libreoffice --headless --convert-to doc /pdf/pdf.pdf --outdir /pdf
libreoffice --headless --convert-to "docx:writer_MS_Word_2007" --outdir /pdf pdf.pdf
But they always return this message :
convert /pdf/pdf.pdf -> /pdf/pdf.docx using filter : writer_MS_Word_2007
Overwriting: /pdf/pdf.docx
Error: Please verify input parameters... (SfxBaseModel::impl_store <file:///pdf/pdf.docx> failed: 0x81a(Error Area:Io Class:Parameter Code:26))
Can anyone give me a clue on what's going on?
UPDATE :
I tried this command :
libreoffice --infilter="writer_pdf_import" --convert-to docx --outdir /pdf /pdf/pdf.pdf
and it returned this message :
convert /pdf/pdf.pdf -> /pdf/pdf.docx using filter : Office Open XML Text
Overwriting: /pdf/pdf.docx
I can see it needs the --infilter parameter, since the input file is a PDF.
But it's using the Office Open XML Text filter, and I need to switch it to Microsoft Word 2007-2013 XML. How can I do that?
I have already tried these, and none of them work :
libreoffice --infilter="writer_pdf_import" --convert-to docx:"Microsoft Word 2007-2013 XML" --outdir /pdf /pdf/pdf.pdf
libreoffice --infilter="writer_pdf_import" --convert-to "docx:Microsoft Word 2007-2013 XML" --outdir /pdf /pdf/pdf.pdf
libreoffice --infilter="writer_pdf_import" --convert-to "docx:writer_MS_Word_2007" --outdir /pdf /pdf/pdf.pdf
libreoffice --infilter="writer_pdf_import" --convert-to docx:"writer_MS_Word_2007" --outdir /pdf /pdf/pdf.pdf
libreoffice --infilter="writer_pdf_import" --convert-to docx:writer_MS_Word_2007 --outdir /pdf /pdf/pdf.pdf
They always return this message (same as above) :
convert /pdf/pdf.pdf -> /pdf/pdf.docx using filter : writer_MS_Word_2007
Overwriting: /pdf/pdf.docx
Error: Please verify input parameters... (SfxBaseModel::impl_store <file:///pdf/pdf.docx> failed: 0x81a(Error Area:Io Class:Parameter Code:26))
Upvotes: 5
Views: 9766
Reputation: 1297
I finally figured out the workaround.
Hopefully, this will be useful for anyone having the same issues.
I ran an experiment, trying the possible Word filters from this list one by one. Four attempts succeeded :
libreoffice --headless --infilter="writer_pdf_import" --convert-to docx --outdir /pdf /pdf/pdf.pdf
libreoffice --headless --infilter='writer_pdf_import' --convert-to docx:"MS Word 2007 XML" --outdir /pdf /pdf/pdf.pdf
libreoffice --headless --infilter='writer_pdf_import' --convert-to doc:"MS Word 2007 XML" --outdir /pdf /pdf/pdf.pdf
libreoffice --headless --infilter="writer_pdf_import" --convert-to doc --outdir /pdf /pdf/pdf.pdf
Of those 4 commands, the last one yields the best result: the converted document looks closest to the original. FYI, my document has some Chinese characters and tables. The first 3 commands didn't draw the table borders correctly, while the last one did.
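If you want to repeat the one-by-one experiment without typing every command, here is a rough sketch of how it could be scripted. The function name try_filters and the SOFFICE variable are made up for this example, and counting a non-empty output file as "success" is my own assumption. The exit status is deliberately not checked, because I am not sure it is reliable when the export fails.

```shell
#!/bin/sh
# Hypothetical helper: tries each --convert-to target in turn and reports
# whether a non-empty output file was produced.
# SOFFICE is an assumed variable naming the LibreOffice binary.
SOFFICE="${SOFFICE:-libreoffice}"

try_filters() {
    input="$1"
    shift
    for target in "$@"; do
        # Use a fresh output directory per attempt so results don't mix.
        outdir=$(mktemp -d) || return 1
        "$SOFFICE" --headless --infilter="writer_pdf_import" \
            --convert-to "$target" --outdir "$outdir" "$input" >/dev/null 2>&1
        # Assumption: a failed export leaves no file (or an empty one).
        if [ -n "$(find "$outdir" -type f -size +0c)" ]; then
            echo "OK $target"
        else
            echo "FAIL $target"
        fi
        rm -rf "$outdir"
    done
}
```

For example, try_filters /pdf/pdf.pdf docx "docx:MS Word 2007 XML" "doc:MS Word 2007 XML" doc prints one OK or FAIL line per target. It only tells you that a file was written, not how good the conversion is, so you still have to open the results and compare them by eye.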
UPDATE :
I decided to install LibreOffice 7.0 in the Ubuntu 18.04 container.
To see the detailed list of filters, go here, then open one of the xcu files. The filter details should be there. To use a filter, pick the value of its name attribute and use it like this :
libreoffice --headless --infilter='writer_pdf_import' --convert-to doc:"<enter_filter_name_here>" --outdir /pdf /pdf/pdf.pdf
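To avoid opening the xcu files by hand, something like the following sketch could pull the names out. It assumes each filter is declared on a single line as a node element carrying an oor:name attribute, which is how the files I looked at appear to be laid out. The function name list_filter_names is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: prints the oor:name attribute of every <node ...>
# element found in the given xcu files.
# Assumption: the <node> tag and its oor:name attribute are on one line.
list_filter_names() {
    sed -n 's/.*<node[^>]* oor:name="\([^"]*\)".*/\1/p' "$@"
}
```

Call it as list_filter_names followed by one or more xcu file paths. Each printed name can then be placed after the colon in --convert-to, as in the command above.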
Upvotes: 11