Reputation: 15239
I am trying to convert a .rtf file to .docx using pypandoc:
import pypandoc
# Specify the input RTF file and output DOCX file
input_file = 'test.rtf'
output_file = 'test.docx'
# Convert the RTF file to DOCX
pypandoc.convert_file(input_file, 'docx', outputfile=output_file)
print(f"Conversion complete. The DOCX file is saved as {output_file}")
However, if the original file contains colors or pictures, they are not kept in the resulting .docx. Am I missing some setting?
Package Version
----------- -------
windows 10
python 3.11
cobble 0.1.4
lxml 5.3.0
mammoth 1.6.0
pip 23.2.1
pypandoc 1.13
python-docx 0.8.11
pywin32 306
setuptools 65.5.0
Upvotes: 0
Views: 176
Reputation: 22659
The fifth paragraph of the pandoc user guide is probably the part that's cited the most:
Because pandoc’s intermediate representation of a document is less expressive than many of the formats it converts between, one should not expect perfect conversions between every format and every other. Pandoc attempts to preserve the structural elements of a document, but not formatting details such as margin size. And some document elements, such as complex tables, may not fit into pandoc’s simple document model. While conversions from pandoc’s Markdown to all formats aspire to be perfect, conversions from formats more expressive than pandoc’s Markdown can be expected to be lossy.
In other words: those colors won't come through, no matter the settings. Images should work in general, but that's hard to judge without seeing the actual document.
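To get a rough idea of what a given file stands to lose, one option is to scan the RTF source for the control words that carry this information: `\colortbl` (the color table), `\cfN` (switch to foreground color N), and `\pict` (an embedded picture). The sketch below is only a heuristic, not a real RTF parser. The function name `summarize_rtf_features` and the sample string are made up for illustration, and an escaped backslash in the text could produce a false positive.

```python
import re


def summarize_rtf_features(rtf_text: str) -> dict:
    """Count RTF control words that hint at colors and embedded pictures.

    Heuristic only: this scans the raw text and does not parse RTF groups.
    """
    return {
        # \colortbl opens the document's color table
        "has_color_table": "\\colortbl" in rtf_text,
        # \cfN selects foreground color N; \cf0 is the default color, so skip it
        "color_runs": len(re.findall(r"\\cf[1-9]\d*", rtf_text)),
        # \pict opens an embedded picture group
        "pictures": len(re.findall(r"\\pict(?![a-zA-Z])", rtf_text)),
    }


# Hypothetical sample, just to show the shape of the result
sample = r"{\rtf1{\colortbl;\red255\green0\blue0;}\cf1 red text\cf0 {\pict\pngblip 89504e47}}"
print(summarize_rtf_features(sample))
```

For a real file, pass the file's contents instead of `sample`. A non-zero `color_runs` means there is color formatting that will be dropped. A `pictures` count of zero suggests the images are not embedded in the file at all, which would explain why they are missing from the output.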
Upvotes: 0