Reputation: 186
I want to create an MS Word document that compiles a lot of my existing code (in MATLAB and Python). I am writing it using python-docx.
If I do something like:
file = open('task1.m', 'r')
document.add_paragraph(file.read())
Then the code gets written to the Word document as plain text, without the formatting.
Is there some way I can write the code while retaining the programming-language formatting (keeping the colors intact)?
Upvotes: 1
Views: 3602
Reputation: 108
For a quick-and-dirty way to achieve this, use Notepad++: turn on syntax highlighting for your language, select the code, right-click, and choose "Plugin Commands > Copy Text with Syntax Highlighting". You can then paste that into Word and the colors remain.
Upvotes: 2
Reputation: 25489
The .m file doesn't contain color information. That's added by whatever IDE / editor you're using.
If you know (or can find out) how to insert HTML-formatted or RTF-formatted text into your Word document, check out the pygments module.
I'm not sure how you can write this RTF-formatted text into a Word document directly. However, if you write it out to an RTF file, that file can be opened and saved by Word.
So let's say I run this code, toword.py:
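from docx import Document
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import RtfFormatter

# Read the source file to highlight (here, this script itself)
with open("toword.py", "r") as f:
    code = f.read()

# Render the code as RTF markup that carries the color information
ht = highlight(code, PythonLexer(), RtfFormatter())

# Written to an .rtf file, the markup gives a document Word can open
with open("rtffile.rtf", "w") as wf:
    wf.write(ht)

# Passed to python-docx, the same markup is inserted as literal text
doc = Document()
paragraph = doc.add_paragraph(ht)
doc.save("code.docx")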
There's also a pygments.lexers.matlab.MatlabLexer for MATLAB files. Or you could use pygments.lexers.get_lexer_for_filename(filename) to get a lexer from the filename.
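As a rough illustration of choosing a lexer per file, here is a minimal sketch. The helper name pick_lexer is made up for this example. It pins the .m extension to MatlabLexer because several Pygments lexers claim that extension (Objective-C and Octave among them), so a filename lookup alone may not return the MATLAB one. Falling back to TextLexer for unknown extensions is also just an assumption about what you would want.

```python
from pygments.lexers import TextLexer, get_lexer_for_filename
from pygments.lexers.matlab import MatlabLexer
from pygments.util import ClassNotFound


def pick_lexer(filename, code=""):
    """Return a Pygments lexer instance for `filename` (hypothetical helper)."""
    # ".m" is claimed by more than one lexer, so pin it to MATLAB explicitly.
    if filename.lower().endswith(".m"):
        return MatlabLexer()
    try:
        # Passing the code lets Pygments break ties between candidate lexers.
        return get_lexer_for_filename(filename, code)
    except ClassNotFound:
        # Unknown extension: fall back to plain, unhighlighted text.
        return TextLexer()


if __name__ == "__main__":
    for name in ("task1.m", "toword.py"):
        print(name, "->", pick_lexer(name).name)
```

The returned lexer can be passed to highlight() in place of PythonLexer().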
Opening rtffile.rtf in Word: [screenshot: rtffile.rtf opened in Word]
Opening code.docx in Word: [screenshot: code.docx opened in Word]
Alternatively, you can use the pandoc module along with its backend. It can convert to the docx format if you supply it some markdown, and it highlights code automatically if the markdown contains fenced code blocks.
So with this code:
import pandoc

# Read the source file to highlight
with open("toword.py", "r") as f:
    code = f.read()

# Wrap the code in a fenced block so pandoc knows which language to highlight
md = f"`````python\n{code}\n`````"

doc = pandoc.Document()
doc.markdown = bytearray(md, encoding="utf-8")
doc.add_argument("out=code.docx")
# Accessing the docx attribute runs the conversion and writes code.docx
doc.docx
we get the following code.docx: [screenshot: code.docx opened in Word]
You can play with the highlight style using the --highlight-style=... argument; the pandoc manual has more info.
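If you would rather not depend on a Python wrapper, the same conversion can be sketched by calling the pandoc executable directly. This assumes pandoc is installed and on your PATH. The helper names fenced_markdown and markdown_to_docx are made up for this example, and "tango" is just one of pandoc's built-in style names picked for illustration.

```python
import shutil
import subprocess


def fenced_markdown(code, language):
    """Wrap `code` in a five-backtick fenced block tagged with `language`."""
    fence = "`" * 5  # longer than three, so code containing ``` stays intact
    return f"{fence}{language}\n{code}\n{fence}\n"


def markdown_to_docx(md, out_path, style="tango"):
    """Pipe markdown into the pandoc CLI and write a .docx file."""
    subprocess.run(
        [
            "pandoc",
            "--from=markdown",
            f"--output={out_path}",
            f"--highlight-style={style}",
        ],
        input=md.encode("utf-8"),
        check=True,  # raise CalledProcessError if pandoc fails
    )


if __name__ == "__main__":
    sample = "def greet(name):\n    return 'Hello, ' + name"
    md = fenced_markdown(sample, "python")
    if shutil.which("pandoc"):
        markdown_to_docx(md, "code_cli.docx")
    else:
        print("pandoc not found on PATH; markdown was:\n" + md)
```

Use "matlab" as the language tag for the .m files, and read the code from disk as in the examples above.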
Upvotes: 3