Programmer

Reputation: 59

Java: get plain text from RTF

I have a database column that holds text in RTF format. How can I extract only the plain text from it using Java?

Upvotes: 5

Views: 12053

Answers (4)

Jens S.

Reputation: 1

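This works if the RTF text has already been loaded into a JEditorPane:

import javax.swing.text.BadLocationException;
import javax.swing.text.Document;

String s = getPlainText(aJEditorPane.getDocument());

// Returns the document's text without formatting, or null if the range is invalid.
String getPlainText(Document doc) {
    try {
        return doc.getText(0, doc.getLength());
    }
    catch (BadLocationException ex) {
        System.err.println(ex);
        return null;
    }
}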

Upvotes: 0

Ben Arnao

Reputation: 543

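import java.io.ByteArrayInputStream;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

RTFEditorKit rtfParser = new RTFEditorKit();
Document document = rtfParser.createDefaultDocument();
rtfParser.read(new ByteArrayInputStream(rtfBytes), document, 0);
String text = document.getText(0, document.getLength());

This should work, where rtfBytes holds the RTF content as a byte array. Note that read and getText throw checked exceptions (IOException and BadLocationException) that the caller has to handle.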
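For a value that comes out of the database as a String, the same RTFEditorKit approach can be wrapped in a small helper. The following is only a sketch: the class name RtfToText, the method name toPlainText, and the sample RTF string are made up for illustration, and it assumes the RTF is plain 7-bit ASCII so that ISO-8859-1 is a safe way to turn the String into bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

public class RtfToText {

    // Hypothetical helper: converts an RTF string (e.g. a column value) to plain text.
    static String toPlainText(String rtf) throws IOException, BadLocationException {
        RTFEditorKit kit = new RTFEditorKit();
        Document doc = kit.createDefaultDocument();
        // Assumption: the RTF is 7-bit ASCII, so ISO-8859-1 maps each char to one byte.
        byte[] bytes = rtf.getBytes(StandardCharsets.ISO_8859_1);
        kit.read(new ByteArrayInputStream(bytes), doc, 0);
        return doc.getText(0, doc.getLength());
    }

    public static void main(String[] args) throws Exception {
        // Sample input invented for this sketch: "Hello" in bold, followed by plain text.
        String rtf = "{\\rtf1\\ansi{\\b Hello}, world}";
        // trim() drops any trailing line break the parser may append.
        System.out.println(toPlainText(rtf).trim());
    }
}
```

RTFEditorKit ships with the JDK, so this needs no extra dependency, but it only understands a subset of RTF; heavily formatted documents may need a dedicated library.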

Upvotes: 2

Mike

Reputation: 3311

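Apache POI reads Microsoft Word formats. Be aware that its HWPF module targets the binary .doc format and, as far as I know, rejects a file that is really RTF, so this helps when the column holds Word documents rather than raw RTF.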

POI

import java.io.FileInputStream;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;

public String getRtfText(String fileName) {
    // try-with-resources closes the stream even if parsing fails.
    try (FileInputStream inStream = new FileInputStream(fileName)) {
        // HWPFDocument parses the document from the input stream.
        HWPFDocument doc = new HWPFDocument(inStream);
        WordExtractor rtfExtractor = new WordExtractor(doc);

        // getParagraphText() returns one array element per paragraph.
        StringBuilder rtfString = new StringBuilder();
        for (String paragraph : rtfExtractor.getParagraphText()) {
            rtfString.append(paragraph);
        }
        return rtfString.toString();
    }
    catch (Exception ex) {
        // Return null on failure instead of dereferencing a null extractor.
        System.err.println(ex.getMessage());
        return null;
    }
}

Upvotes: 0

PeakGen

Reputation: 22995

You could also try AdvancedRTFEditorKit: http://java-sl.com/advanced_rtf_editor_kit.html

I have used it to build a complete RTF editor with all the features MS Word supports.

Upvotes: 0
