Reputation: 11774
Python Docx is a pretty good library for generating Microsoft Word documents without dealing with all of the COM stuff directly. Nonetheless, I'm running into some limitations.
I want a paragraph to have multiple lines without any extra space between them. However, writing out a string that separates the lines with the usual \n does not work, and neither do the other line-break encodings I tried. Any other thoughts, or is this framework too limited for something like that?
Upvotes: 11
Views: 22462
Reputation: 4538
You can get a line break (rather than a new paragraph) in python-docx by calling add_break() on your run. For example:
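from docx import Document

doc = Document()
p = doc.add_paragraph()
run = p.add_run("First line")
run.add_break()  # line break inside the same paragraph
run.add_text("Second line")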
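Building on this, a small helper can spread a multi-line string over a single paragraph, with a break between lines. This is only a minimal sketch: the name add_multiline_paragraph is made up for illustration and is not part of python-docx, and it relies only on the add_paragraph(), add_run() and add_break() calls shown above. It takes the document as an argument, so it needs no import of its own.

```python
def add_multiline_paragraph(doc, text):
    """Add `text` to `doc` as ONE paragraph, with a line break between lines.

    Hypothetical helper, not part of python-docx. `doc` is expected to be a
    python-docx Document (e.g. Document() from `from docx import Document`).
    """
    paragraph = doc.add_paragraph()
    lines = text.splitlines()
    last_index = len(lines) - 1
    for index, line in enumerate(lines):
        run = paragraph.add_run(line)
        # Break after every line except the last, so there is no trailing
        # empty line at the end of the paragraph.
        if index < last_index:
            run.add_break()
    return paragraph
```

With python-docx installed, calling add_multiline_paragraph(doc, "One\nTwo\nThree") on a Document and then saving it should give one paragraph with three lines and no paragraph spacing between them.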
Upvotes: 8
Reputation: 28903
As of v0.7.2, python-docx translates '\n' and '\r' characters in a string to <w:br/> elements, which provides the behavior you describe. It also translates '\t' characters into <w:tab/> elements.
This behavior is available for strings provided to:
Document.add_paragraph()
Paragraph.add_run()
and for strings assigned to:
Paragraph.text
Run.text
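To make the mapping concrete, here is a rough standard-library sketch of the translation described above. It is not python-docx's actual implementation: the function name run_from_text is hypothetical, and the code only mimics the mapping of '\n' and '\r' to <w:br/> and of '\t' to <w:tab/> inside a single run.

```python
import re
import xml.etree.ElementTree as ET

# Standard WordprocessingML namespace, bound to the usual "w" prefix.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def run_from_text(text):
    """Build a <w:r> element, turning line breaks and tabs into child elements.

    Illustration only: each '\n' or '\r' is handled on its own, so "\r\n"
    produces two breaks in this sketch.
    """
    run = ET.Element(f"{{{W_NS}}}r")
    # The capture group keeps the control characters as separate tokens.
    for token in re.split(r"([\n\r\t])", text):
        if token in ("\n", "\r"):
            ET.SubElement(run, f"{{{W_NS}}}br")
        elif token == "\t":
            ET.SubElement(run, f"{{{W_NS}}}tab")
        elif token:
            text_element = ET.SubElement(run, f"{{{W_NS}}}t")
            text_element.text = token
    return run


serialized = ET.tostring(run_from_text("One\nTwo\tThree"), encoding="unicode")
print(serialized)
```

The printed run contains the three text pieces separated by a break element and a tab element, all inside one run, which is why the lines stay in the same paragraph.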
Upvotes: 2
Reputation: 14938
I'm not sure if this is possible. It looks as though Word treats a press of the Enter key (which I take to be the rough equivalent of "\r\n" or "\n") as the creation of a new paragraph.
If I record a macro in Word that consists of typing "One" and then pressing Enter, I get this VBA:
Selection.TypeText Text:="One"
Selection.TypeParagraph
If I create a Word document that looks like this (pressing enter after each word):
One
Two
Three
The body of that document looks like this in the document.xml file:
<w:body>
  <w:p w:rsidR="00BE37B0" w:rsidRDefault="00CF2350">
    <w:r>
      <w:t>One</w:t>
    </w:r>
  </w:p>
  <w:p w:rsidR="00CF2350" w:rsidRDefault="00CF2350">
    <w:r>
      <w:t>Two</w:t>
    </w:r>
  </w:p>
  <w:p w:rsidR="00CF2350" w:rsidRDefault="00CF2350">
    <w:r>
      <w:t>Three</w:t>
    </w:r>
  </w:p>
  <w:sectPr w:rsidR="00CF2350" w:rsidSect="001077CC">
    <w:pgSz w:w="11906" w:h="16838"/>
    <w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440" w:header="708" w:footer="708" w:gutter="0"/>
    <w:cols w:space="708"/>
    <w:docGrid w:linePitch="360"/>
  </w:sectPr>
</w:body>
From MSDN we can see that the <w:p> element represents a paragraph.
I think the solution to this would be to follow the example in Python Docx:
body.append(paragraph("Hi."))
body.append(paragraph("My name is Alice."))
body.append(paragraph("Let's code"))
Or:
for paragraph_text in "Hi. \nMy name is Alice.\n Let's code".split("\n"):
    body.append(paragraph(paragraph_text.strip()))
Edit:
Looking into this some more, if you press Shift + Enter in Word it adds a manual line break (not a paragraph) by inserting Chr(11). In Open XML, this translates to a Break.
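The difference between the two is easy to see by reading the XML back. Below is a minimal standard-library sketch, assuming the standard WordprocessingML namespace; the function name paragraph_texts and the SAMPLE string are made up for illustration. It returns one entry per <w:p> and renders each <w:br/> as a newline inside that entry.

```python
import xml.etree.ElementTree as ET

# Standard WordprocessingML namespace used by the w: prefix.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_texts(document_xml):
    """Return the text of each <w:p>, one list entry per paragraph.

    Hypothetical helper for illustration; a <w:br/> inside a paragraph is
    rendered as "\n" so breaks and paragraphs can be told apart.
    """
    root = ET.fromstring(document_xml)
    texts = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        parts = []
        for node in paragraph.iter():
            if node.tag == f"{{{W_NS}}}t":
                parts.append(node.text or "")
            elif node.tag == f"{{{W_NS}}}br":
                parts.append("\n")  # manual line break, same paragraph
        texts.append("".join(parts))
    return texts


# Two paragraphs: the second one holds two lines separated by a break.
SAMPLE = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>One</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Two</w:t><w:br/><w:t>Three</w:t></w:r></w:p>"
    "</w:body></w:document>"
)
print(paragraph_texts(SAMPLE))  # ['One', 'Two\nThree']
```

For a real file, the XML could be read with zipfile.ZipFile(path).read("word/document.xml"), assuming the usual package layout of a .docx.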
Looking at the docx.py file of Python Docx, something like this might be the way to go (disclaimer: not tested):
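# Runs must sit inside a paragraph element, not directly in the body.
p = makeelement('p')
for text in "Hi. \nMy name is Alice.\n Let's code".split("\n"):
    run = makeelement('r')
    run.append(makeelement('t', tagtext=text.strip()))
    run.append(makeelement('br'))
    p.append(run)
body.append(p)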
Upvotes: 8