Omar

Reputation: 567

add_paragraph() in docx adds newline

I am using the following piece of code:

from docx.shared import Pt  # needed for Pt(12) below

def header_build(self, boldText, dataText):
    # document.add_heading('Document Title', 0)
    p = self.document.add_paragraph()
    p.style = self.document.styles['Body Text']
    p.style.font.size = Pt(12)
    p.style.font.name = 'Times New Roman'
    p.add_run(boldText).bold = True
    p.add_run(dataText)

The idea is that when I call header_build like this:

self.header_build(boldText='Owner: ', dataText='Name')

I get the following:

Owner: Name

The problem is that an empty line appears before the text I am adding.

Upvotes: 4

Views: 13961

Answers (3)

Walk

Reputation: 1649

This worked for me :)

import docx

doc = docx.Document()
paragraph = doc.add_paragraph('Lorem ipsum ')
run = paragraph.add_run('dolor')
run.bold = True
paragraph.add_run(' sit amet.')
doc.save('test.docx')

Output:

Lorem ipsum dolor sit amet.

Upvotes: 1

scanny

Reputation: 28903

Mimx is quite right. A new document created with Document() contains a single empty paragraph.

This behavior is dictated by Word. When you open a new Word file with paragraph markers visible, you'll see the insertion point just before a single paragraph marker. The ISO/IEC 29500 spec for Word reflects this: a document (w:body element) must contain one or more paragraphs to be valid. Unfortunately, this means you need to deal with the first paragraph of a new document differently than those you add later.

If you want, you can remove that first paragraph before you start adding content like this:

from docx import Document

document = Document()
document._body.clear_content()

If you save after this call without adding any content, the .docx file will be invalid and may not load or may require a "repair" step. But as long as you add content, this will work fine, and has the advantage that adding paragraphs is uniform, i.e. adding the first paragraph is done the same way as adding later paragraphs.

Otherwise, you need to get the first paragraph and operate on it separately from the rest:

paragraph = document.paragraphs[0]
paragraph.text = 'foobar'
paragraph.style = 'Heading 1'
# etc.

for text in content_blocks:
    paragraph = document.add_paragraph()
    paragraph.text = text
    paragraph.style = 'Body Text'

Upvotes: 1

user7609283

Reputation:

Problem:

I think this happens because you are adding to an existing, empty document that you created manually (without python-docx). When that document was created, it already contained an empty paragraph, paragraphs[0]. Calling add_paragraph() then creates a second paragraph, paragraphs[1], and leaves the first one empty, which shows up as a blank line above your text.

Solution:

There are two solutions:

Either you insert text into paragraphs[0] instead of creating a new paragraph:

def header_build(self, boldText, dataText):
    # reuse the existing empty paragraph, paragraphs[0]
    p = self.document.paragraphs[0]

    p.style = self.document.styles['Body Text']
    p.style.font.size = Pt(12)
    p.style.font.name = 'Times New Roman'
    p.add_run(boldText).bold = True
    p.add_run(dataText)
    print(p.text)

Or you could create a new document with python-docx, so that the paragraph added by add_paragraph() becomes paragraphs[0] (no changes to the header_build function):

from docx import Document

# create a new document
document = Document()

Upvotes: 5
