Martin

Reputation: 53

Apache POI get line breaks from XWPFRun

I have a problem reading whitespace from a docx file using Apache POI 3.15. My Word document contains line breaks, but when I read the file with Apache POI I cannot find a way to get them. When I call paragraph.getParagraphText(), the text is returned with the line breaks. When I iterate over the XWPFRun objects, I only get the text and formatting, with no information about line breaks.

This is the code I use. The br, tab, cr and separator lists are always empty.

    XWPFDocument document = new XWPFDocument(fis);
    List<XWPFParagraph> paragraphs = document.getParagraphs();

    for (XWPFParagraph paragraph : paragraphs) {
        //System.out.println(paragraph.getParagraphText());
        for (XWPFRun run : paragraph.getRuns()) {
            // Low-level XML view of the run (the w:r element)
            CTR ctr = run.getCTR();
            List<CTBr> brList = ctr.getBrList();
            List<CTEmpty> tabList = ctr.getTabList();
            List<CTEmpty> crList = ctr.getCrList();
            List<CTEmpty> separatorList = ctr.getSeparatorList();
            // getText expects the index of a text element inside the run (0 = first);
            // getTextPosition() is the vertical baseline offset, not an index.
            String text = run.getText(0);
            String color = run.getColor();
            boolean bold = run.isBold();
            boolean italic = run.isItalic();
            System.out.println("text: " + text + " color: " + color + " bold: " + bold + " italic: " + italic);

            for (CTEmpty cr : crList) {
                System.out.println(cr);
            }
        }
    }

Is using the CTR object the correct way to go, or is there another way to get those line breaks?

Word Example

Upvotes: 2

Views: 3382

Answers (1)

Martin

Reputation: 53

I found a solution to get the line breaks. Hard returns (Enter) come back as separate paragraphs that have no text and carry a spacingAfter value. Soft returns (Shift+Enter) inside a paragraph come back as break elements, available via run.getCTR().getBrList().
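For anyone wondering why the soft breaks show up in getBrList(): in the underlying document XML a soft line break is stored as a w:br element inside the run (w:r), next to the w:t text elements. The list returned by getBrList() tells you how many breaks a run has, but not where they sit relative to the text, so you need to walk the run's children in order to rebuild the text with its breaks. Below is a minimal sketch of that walk. Note that it uses only the JDK's DOM parser on a hand-written run fragment, not POI itself; the class name RunBreakSketch, the method runText and the sample XML are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Apache POI.
final class RunBreakSketch {
    // Namespace of the main WordprocessingML elements (w:r, w:t, w:br, ...).
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Assumed sample: one run holding "first", a soft break, then "second".
    static final String SAMPLE_RUN =
        "<w:r xmlns:w=\"" + W_NS + "\">"
        + "<w:t>first</w:t><w:br/><w:t>second</w:t>"
        + "</w:r>";

    // Walks the direct children of a w:r element in document order.
    // Simplification: every w:br becomes '\n', page and column breaks included.
    static String runText(String runXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(runXml)));
            StringBuilder out = new StringBuilder();
            for (Node n = doc.getDocumentElement().getFirstChild();
                 n != null; n = n.getNextSibling()) {
                if (n.getNodeType() != Node.ELEMENT_NODE
                        || !W_NS.equals(n.getNamespaceURI())) {
                    continue; // skip anything that is not a w:* element
                }
                switch (n.getLocalName()) {
                    case "t":   out.append(n.getTextContent()); break;
                    case "br":
                    case "cr":  out.append('\n'); break;
                    case "tab": out.append('\t'); break;
                    default:    break; // run properties etc. are ignored
                }
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse run XML", e);
        }
    }

    public static void main(String[] args) {
        // Make the break visible on one output line.
        System.out.println(runText(SAMPLE_RUN).replace("\n", "\\n"));
    }
}
```

A run with no w:br children simply yields its text unchanged, which matches what I saw for paragraphs without soft returns.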

Upvotes: 2
