Atefeh Rashidi

Reputation: 515

run.getFontFamily() returns null for .docx files using Apache POI

I want to read a .docx file paragraph by paragraph and check the font family, font size, margins, alignment, color, etc. of each paragraph. This is an example of my .docx file:

[Image: sample of the .docx file]

And this is my code:

FileInputStream fis = new FileInputStream("D:/test3.docx");
XWPFDocument docx = new XWPFDocument(fis);
List<XWPFParagraph> paragraphList = docx.getParagraphs();
for (int i = 0; i < paragraphList.size(); i++) {
    System.out.println("paragraph " + i + " is::    " + paragraphList.get(i).getText());
    for (XWPFRun run : paragraphList.get(i).getRuns()) {
        System.out.println("paragraph :: run text is::    " + run.text());
        System.out.println("paragraph :: run color is::    " + run.getColor());
        System.out.println("paragraph :: run font-family is::    " + run.getFontFamily()); // always returns null; why?
        System.out.println("paragraph :: run font-name is::    " + run.getFontName()); // always returns null; why?
        System.out.println("paragraph :: run text position is::    " + run.getTextPosition()); // always returns -1; why?
        System.out.println("paragraph :: run font-size is::    " + run.getFontSize());
        System.out.println("paragraph :: run IsBold::    " + run.isBold());
        System.out.println("paragraph :: run IsItalic::    " + run.isItalic());
    }
}

But fontFamily (for every font family I choose) and fontName are always null, and textPosition is always -1. I have another code sample that tries to do this through the paragraph styles:

XWPFStyles styles = docx.getStyles();
for (int i = 0; i < paragraphList.size(); i++) {
    System.out.println("paragraph " + i + " styleID  is::    " + paragraphList.get(i).getStyleID());
    if (paragraphList.get(i).getStyleID() != null) {
        String styleid = paragraphList.get(i).getStyleID();
        XWPFStyle style = styles.getStyle(styleid);
        if (style != null) {
            System.out.println("style name is::    " + style.getName());
            if (style.getName().startsWith("heading")) {
                System.out.println("This part of text is heading!!");
            }
        }
    }
}

But the style is usually null, except for headings.

Upvotes: 1

Views: 1265

Answers (3)

Atefeh Rashidi

Reputation: 515

I found out that the fonts I wanted to check were not standard. For other, standard fonts, all of the code above works perfectly.

Upvotes: 0

Gayan Kavirathne

Reputation: 3237

Here is sample code that reads the font from the paragraph style in styles.xml using Apache POI.

// Returns the font family of a run, falling back to the paragraph style in styles.xml.
static String getFontFamily(XWPFDocument docx, XWPFRun run) {
    String fontFamily = run.getFontFamily();
    if (fontFamily == null) {
        // The run itself sets no font, so check the paragraph's style in styles.xml.
        String styleID = run.getParagraph().getStyleID();
        XWPFStyle style = (styleID == null) ? null : docx.getStyles().getStyle(styleID);
        CTRPr ctrPr = (style == null) ? null : style.getCTStyle().getRPr();
        // getRFonts() is kept from the original answer; it assumes a POI version whose CTRPr has this accessor.
        CTFonts ctFonts = (ctrPr == null) ? null : ctrPr.getRFonts();
        if (ctFonts != null) {
            fontFamily = ctFonts.getAscii(); // or getCs(), getHAnsi(), getEastAsia()
        }
    }
    return fontFamily; // still null if neither the run nor the style names a font
}

Hope this helps.

Upvotes: 2

Mohamed Shakeel

Reputation: 360

Apache POI parses the document.xml part of a .docx file. When you call run.getFontFamily(), it returns the font family only if one is present in the run properties of that run; otherwise it returns null. For example, consider this sample run:

<w:r>
    <w:rPr>
        <w:lang w:val="en-US"/>
    </w:rPr>
    <w:t>The quick brown fox jumps over the lazy dog.</w:t>
</w:r>

This run does not have its font family specified in the <w:rPr> run properties tag. In cases like this, you have to go up the hierarchy and check whether the paragraph containing the run has a style. If the <w:pPr> paragraph properties do not specify a style either, the document's default font family is applied. The document defaults are specified in the styles.xml file.
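The lookup order in this answer can be sketched without POI by reading the XML directly. This is only a minimal illustration using the JDK's DOM parser, not how POI itself is implemented. The class FontCascadeSketch, its method names, and the Calibri default are hypothetical, and the sketch ignores run styles (w:rStyle), style inheritance (w:basedOn), and theme fonts, all of which a real document may use.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class FontCascadeSketch {
    // WordprocessingML main namespace, bound to the "w" prefix in document.xml and styles.xml.
    public static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Parses an XML fragment and returns its root element. */
    public static Element parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement();
    }

    /** Returns the w:ascii value of the first w:rFonts under root, or null if there is none. */
    public static String asciiFont(Element root) {
        NodeList fonts = root.getElementsByTagNameNS(W_NS, "rFonts");
        if (fonts.getLength() == 0) {
            return null;
        }
        String ascii = ((Element) fonts.item(0)).getAttributeNS(W_NS, "ascii");
        return ascii.isEmpty() ? null : ascii;
    }

    /** Checks the run first, then the paragraph style, then the document defaults. */
    public static String resolve(Element run, Element paragraphStyle, Element docDefaults) {
        for (Element level : new Element[] {run, paragraphStyle, docDefaults}) {
            if (level != null) {
                String font = asciiFont(level);
                if (font != null) {
                    return font;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        String ns = " xmlns:w=\"" + W_NS + "\"";
        // The sample run from this answer: it has w:lang but no w:rFonts.
        Element run = parse("<w:r" + ns + "><w:rPr><w:lang w:val=\"en-US\"/></w:rPr>"
                + "<w:t>The quick brown fox jumps over the lazy dog.</w:t></w:r>");
        // Hypothetical document defaults, as they might appear in styles.xml.
        Element defaults = parse("<w:docDefaults" + ns + "><w:rPrDefault><w:rPr>"
                + "<w:rFonts w:ascii=\"Calibri\"/></w:rPr></w:rPrDefault></w:docDefaults>");

        System.out.println(asciiFont(run));               // prints "null"
        System.out.println(resolve(run, null, defaults)); // prints "Calibri"
    }
}
```

In a real file, the three elements would come from word/document.xml and word/styles.xml inside the .docx archive rather than from string literals.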

Upvotes: 1
