Paul Reiners

Reputation: 7896

HTMLDocument, HTMLEditorKit, and blank spaces

When I run the following code:

import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

import javax.swing.text.BadLocationException;
import javax.swing.text.EditorKit;
import javax.swing.text.Element;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;
    .
    .
    .
        String content = "x";
        String html = "<html><body><dyn/>" + content + "<dyn/></body></html>";
        final Reader reader = new StringReader(html);
        final EditorKit editorKit = new HTMLEditorKit();

        HTMLDocument hTMLDocument = new HTMLDocument();
        editorKit.read(reader, hTMLDocument, 0);
        Element defaultRootElement = hTMLDocument.getDefaultRootElement();
        Element branchElement = defaultRootElement.getElement(1).getElement(0);
        for (int i = 0; i < branchElement.getElementCount(); i++) {
            Element element = branchElement.getElement(i);
            System.out.print(element);
        }

I get the following output:

LeafElement(dyn) 1,2
LeafElement(content) 2,3
LeafElement(dyn) 3,4
LeafElement(content) 4,5

However, if I change the value of content to " ":

    String content = " ";

I get this output:

LeafElement(dyn) 1,2
LeafElement(dyn) 2,3
LeafElement(content) 3,4

Why is a content LeafElement constructed for "x", but not for " "? I want a LeafElement to be constructed for " ". Am I doing something wrong or is this a problem with HTMLDocument or HTMLEditorKit?

Upvotes: 2

Views: 1654

Answers (2)

camickr

Reputation: 324108

I don't know much about the editor kit, but maybe you can use &nbsp; instead of " ".
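
A minimal sketch of that suggestion, assuming the content string can be changed before it is concatenated into the HTML. The class name NbspWorkaround and the helper toNbsp are hypothetical names used only for illustration. Note that the parsed document will then hold U+00A0 (a non-breaking space) rather than an ordinary space.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.swing.text.BadLocationException;
import javax.swing.text.Element;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

class NbspWorkaround {

    // Hypothetical helper: swap each ordinary space for the &nbsp; entity.
    static String toNbsp(String content) {
        return content.replace(" ", "&nbsp;");
    }

    // Parses the snippet from the question with the given content.
    static HTMLDocument parse(String content)
            throws IOException, BadLocationException {
        String html = "<html><body><dyn/>" + content + "<dyn/></body></html>";
        HTMLDocument document = new HTMLDocument();
        new HTMLEditorKit().read(new StringReader(html), document, 0);
        return document;
    }

    public static void main(String[] args) throws Exception {
        HTMLDocument document = parse(toNbsp(" "));
        // Same traversal as in the question: body, then its first child.
        Element branch =
                document.getDefaultRootElement().getElement(1).getElement(0);
        for (int i = 0; i < branch.getElementCount(); i++) {
            // LeafElement.toString() already ends with a newline.
            System.out.print(branch.getElement(i));
        }
    }
}
```

If the space has to be an ordinary one when the text is read back out of the document, the U+00A0 characters would need to be mapped back afterwards.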

Upvotes: 1

HQCasanova

Reputation: 1158

  • I'm hoping for an explanation of why this is happening.

This is just the product of whitespace collapsing in HTML. Because the space you are inserting is the only thing between the two <dyn/> tags, the parser ignores it, so no LeafElement is created for it.
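
One way to check this for yourself is to bypass HTMLDocument and listen to the parser directly. The sketch below records every text run that the stock Swing parser reports through ParserCallback.handleText; the class name TextEventProbe and the method textRuns are hypothetical. If the explanation above is right, the "x" case should report a run and the " " case should not.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

class TextEventProbe {

    // Returns every text run the parser reports for the question's snippet.
    static List<String> textRuns(String content) {
        String html = "<html><body><dyn/>" + content + "<dyn/></body></html>";
        final List<String> runs = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                runs.add(new String(data));
            }
        };
        try {
            // ParserDelegator is the parser that HTMLEditorKit uses by default.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println("\"x\" -> " + textRuns("x"));
        System.out.println("\" \" -> " + textRuns(" "));
    }
}
```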

  • Possible solution

As camickr mentioned, you would have to use non-breaking space entities (&nbsp;) to preserve all whitespace. But since you have no control over the HTML, your best bet is to customise HTMLEditorKit's parser. Perhaps the following resources may come in useful:

Hope this helps!
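
To make the parser customisation a bit more concrete, here is a rough sketch under stated assumptions, not a definitive implementation. It overrides HTMLEditorKit.getParser() so that the markup is pre-processed before the stock ParserDelegator sees it. The class name DynSpaceKit, the helper protectGaps, and the regular expression (which only targets whitespace sitting between two <dyn/> tags, as in the question) are all hypothetical. As with the &nbsp; approach, the document ends up with U+00A0 characters rather than ordinary spaces.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import javax.swing.text.Element;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

class DynSpaceKit extends HTMLEditorKit {

    // Assumption: only whitespace directly between two <dyn/> tags matters.
    private static final Pattern GAP =
            Pattern.compile("(?<=<dyn/>)[ \\t\\r\\n]+(?=<dyn/>)");

    // Rewrites each whitespace character in such a gap as &nbsp;.
    static String protectGaps(String html) {
        Matcher matcher = GAP.matcher(html);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            StringBuilder entities = new StringBuilder();
            for (int i = 0; i < matcher.group().length(); i++) {
                entities.append("&nbsp;");
            }
            matcher.appendReplacement(
                    out, Matcher.quoteReplacement(entities.toString()));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    @Override
    protected HTMLEditorKit.Parser getParser() {
        final HTMLEditorKit.Parser stock = new ParserDelegator();
        return new HTMLEditorKit.Parser() {
            @Override
            public void parse(Reader in, HTMLEditorKit.ParserCallback callback,
                              boolean ignoreCharSet) throws IOException {
                // Read the whole input so it can be rewritten in one pass.
                StringBuilder markup = new StringBuilder();
                char[] buffer = new char[4096];
                for (int n = in.read(buffer); n != -1; n = in.read(buffer)) {
                    markup.append(buffer, 0, n);
                }
                stock.parse(new StringReader(protectGaps(markup.toString())),
                            callback, ignoreCharSet);
            }
        };
    }

    public static void main(String[] args) throws Exception {
        HTMLDocument document = new HTMLDocument();
        new DynSpaceKit().read(
                new StringReader("<html><body><dyn/> <dyn/></body></html>"),
                document, 0);
        Element branch =
                document.getDefaultRootElement().getElement(1).getElement(0);
        for (int i = 0; i < branch.getElementCount(); i++) {
            System.out.print(branch.getElement(i));
        }
    }
}
```

Reading the whole input into memory is fine for small snippets like the one in the question; a large document would call for a streaming filter instead.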

Upvotes: 1
