Ivan

Reputation: 13

Page break in HTML-to-docx conversion

I have a simple HTML template that I convert to docx with docx4j:

<html>

<head>
    <style type="text/css">
    tr,
    h2,
    tnr {
        font-family: Times New Roman;
        font-size: 11pt;
    }

    h2 {
        text-align: center;
    }

    .notesTable {
        border: 4px double black;
        border-collapse: collapse;
        border: 1px solid black;
    }
    </style>
</head>

<body>
    <table align="center" style="width: 75%; margin-left: -25%">
        <tbody>
            <tr style="height: 25px;font-family: 'Times New Roman';font-size: 16pt;">
                <td>28.02.2016 sunday</td>
                <td style="text-align: center; width: 30%;">test</td>
            </tr>
        </tbody>
    </table>
    <div>
        <ol>
            <li>ex1 </li>
            <li>ex2</li>
        </ol>
    </div>
    <p style="text-align: left;">
        <span style="font-family:'Comic Sans MS';">
    test
    </span>
    </p>
    <p>
        <h2>comments</h2> test
    </p>
    <p>
        <h2>contacts</h2> test
    </p>
    <br style="page-break-after: always; clear:both;" />
    <p>
    </p>
</body>

</html>

The problem is in this line:

<br style="page-break-after: always; clear:both;" />

With the self-closing form, the resulting document has no page break. When I change it to

<br style="page-break-after: always; clear:both;">

the page break appears, but I get this exception:

org.xml.sax.SAXParseException; lineNumber: 142; columnNumber: 3; The element type "br" must be terminated by the matching end-tag "</br>".

and all styles fall back to their defaults. What am I doing wrong?

import org.docx4j.model.structure.PageSizePaper;
import org.docx4j.openpackaging.exceptions.Docx4JException;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.WordprocessingML.AltChunkType;
import org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart;

import java.io.FileNotFoundException;
import java.io.FileOutputStream;

public class App {
    public static void main(String[] args) throws Docx4JException, FileNotFoundException {
        WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage(PageSizePaper.A4, false);

        MainDocumentPart mdp = wordMLPackage.getMainDocumentPart();
        String xhtml = "<html>" +
                "<head>" +
                "    <style type=\"text/css\">" +
                "    h2 {" +
                "        text-align: center;" +
                "        font-family: Times New Roman;" +
                "        font-size: 11 pt;" +
                "    }" +
                "    </style>" +
                "</head>" +
                "<body>" +
                "    <h2> Line on the first page</h2>" +
                "    <br style=\"page-break-after: always; clear:both;\" >" +
                "    <h2> Line on the second page</h2>" +
                "</body>" +
                "</html>";
        mdp.addAltChunk(AltChunkType.Xhtml, xhtml.getBytes());
        WordprocessingMLPackage pkgOut = mdp.convertAltChunks();
        FileOutputStream stream1 = new FileOutputStream("test.doc");
        pkgOut.save(stream1);

    }
}

Upvotes: 1

Views: 2178

Answers (1)

Terrence

Reputation: 116

I think you need to implement this yourself:

  1. Download XHTMLImporterImpl.java from GitHub.

  2. Add logic to the "br" branch of the method "processInlineBoxContent", like this:

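    // Create the break; by default it is an ordinary line break
    Br br = Context.getWmlObjectFactory().createBr();
    // Look at the inline style of the <br> element being processed
    Attr attrNode = s.getElement().getAttributeNode("style");
    // Note: this substring match depends on the exact spacing in the style value
    if (attrNode != null && attrNode.getValue().contains("page-break-after: always")) {
      br.setType(STBrType.PAGE);
    }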
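
The substring check above only matches when the style is written exactly as "page-break-after: always", with a single space after the colon. As a rough sketch of a more tolerant check, the helper below splits the style value into declarations and compares the property and value separately. The class name PageBreakStyle and the method hasPageBreakAfter are made up for illustration and are not part of docx4j; it uses only the JDK and does not handle things like "!important".

```java
import java.util.Locale;

// Hypothetical helper, not part of docx4j: decides whether an inline
// style attribute value declares "page-break-after: always".
public class PageBreakStyle {

    public static boolean hasPageBreakAfter(String style) {
        if (style == null) {
            return false;
        }
        // Each declaration looks like "property: value"
        for (String declaration : style.split(";")) {
            String[] parts = declaration.split(":", 2);
            if (parts.length != 2) {
                continue; // empty or malformed declaration
            }
            String property = parts[0].trim().toLowerCase(Locale.ROOT);
            String value = parts[1].trim().toLowerCase(Locale.ROOT);
            if (property.equals("page-break-after") && value.equals("always")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasPageBreakAfter("page-break-after: always; clear:both;"));
        System.out.println(hasPageBreakAfter("page-break-after:always"));
        System.out.println(hasPageBreakAfter("clear:both"));
    }
}
```

In the "br" branch, the condition could then become something like hasPageBreakAfter(attrNode.getValue()), assuming the helper is made available to the modified importer class.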
    

Upvotes: 2
