edm3

Reputation: 123

RTF to Plain Text in Java

How do you convert an RTF string to plain text in Java? The obvious answer is Swing's RTFEditorKit, and that seems to be the common answer around the Internet. However, the write method that claims to produce plain text isn't actually implemented: in Java 6 it is hard-coded to throw an IOException.
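
A minimal sketch that reproduces the behaviour described, assuming a small hand-written RTF sample. The class name `RtfWriteCheck` and the helper `tryWriterOverload` are made up for illustration. It parses the sample, then calls the `write(Writer, Document, int, int)` overload and reports the exception message, if any.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

// Hypothetical helper class, not part of the JDK.
class RtfWriteCheck {

    /**
     * Parses the RTF, then tries the Writer-based write overload.
     * Returns the IOException message if one is thrown, or null otherwise.
     */
    static String tryWriterOverload(String rtf) throws IOException, BadLocationException {
        RTFEditorKit kit = new RTFEditorKit();
        Document doc = kit.createDefaultDocument();
        kit.read(new StringReader(rtf), doc, 0);
        try {
            kit.write(new StringWriter(), doc, 0, doc.getLength());
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // Illustrative sample document.
        String rtf = "{\\rtf1\\ansi{\\fonttbl\\f0\\fswiss Helvetica;}\\f0\\pard Hello\\par}";
        System.out.println("write(Writer, ...) failed with: " + tryWriterOverload(rtf));
    }
}
```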

Upvotes: 12

Views: 33605

Answers (4)

hitesh007

Reputation: 51

Here is the full code to parse an RTF file and write it out as plain text:

    import java.io.FileInputStream;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.Reader;
    import java.io.Writer;
    import javax.swing.text.BadLocationException;
    import javax.swing.text.Document;
    import javax.swing.text.rtf.RTFEditorKit;

    public class RtfToText {
        public static void main(String[] args) throws IOException, BadLocationException {
            RTFEditorKit rtf = new RTFEditorKit();
            Document doc = rtf.createDefaultDocument();

            // Parse the RTF file into the Swing document model.
            try (Reader in = new InputStreamReader(
                    new FileInputStream("C:\\SampleINCData.rtf"), "UTF-8")) {
                rtf.read(in, doc, 0);
            }

            // The document's text is the plain text, with the RTF formatting stripped.
            String text = doc.getText(0, doc.getLength());

            // Write the plain text out; try-with-resources closes the writer even on failure.
            try (Writer out = new FileWriter("B:\\Sample INC Data.txt")) {
                out.write(text);
            }
            // Only reached if both the read and the write succeeded.
            System.out.println("Success...");
        }
    }

Upvotes: 0

Jon Iles

Reputation: 2579

You might consider RTF Parser Kit as a lightweight alternative to the Swing RTFEditorKit. The line below shows plain text extraction from an RTF file: the RTF is read from the input stream, and the extracted text is written to the output stream.

    new StreamTextConverter().convert(new RtfStreamSource(inputStream), outputStream, "UTF-8");

(full disclosure: I'm the author of RTF Parser Kit)

Upvotes: 2

morja

Reputation: 8560

I use Swing's RTFEditorKit in Java 6 like this:

    RTFEditorKit rtfParser = new RTFEditorKit();
    Document document = rtfParser.createDefaultDocument();
    rtfParser.read(new ByteArrayInputStream(rtfBytes), document, 0);
    String text = document.getText(0, document.getLength());

and that's working.
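
For an RTF string rather than a byte array, the same approach can be sketched as a self-contained class. This is only a sketch under some assumptions: the class name `RtfStringToText`, the method `toPlainText`, and the sample RTF are made up for illustration, and it uses the `read(Reader, Document, int)` overload with a `StringReader` instead of the `ByteArrayInputStream` shown above.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

// Hypothetical helper class, not part of the JDK.
class RtfStringToText {

    /** Parses the RTF markup and returns the document's text content. */
    static String toPlainText(String rtf) throws IOException, BadLocationException {
        RTFEditorKit kit = new RTFEditorKit();
        Document doc = kit.createDefaultDocument();
        // Insert the parsed content at position 0 of the empty document.
        kit.read(new StringReader(rtf), doc, 0);
        return doc.getText(0, doc.getLength());
    }

    public static void main(String[] args) throws Exception {
        // Illustrative sample: one paragraph with a bold run.
        String rtf = "{\\rtf1\\ansi{\\fonttbl\\f0\\fswiss Helvetica;}"
                + "\\f0\\pard This is some {\\b bold} text.\\par}";
        System.out.println(toPlainText(rtf));
    }
}
```

The returned text may carry trailing line breaks from paragraph marks, so callers that need an exact string may want to trim it.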

Upvotes: 21

Aleadam

Reputation: 40391

Try Apache Tika: http://tika.apache.org/0.9/formats.html#Rich_Text_Format

Upvotes: 6
