Balázs Németh

Reputation: 6647

Java escape HTML

Currently I use org.apache.commons.lang.StringEscapeUtils.escapeHtml() to escape unwanted HTML tags in my Strings, but then I realized it also escapes accented characters to &something; entities, which I don't want.

Do you know of any solution that escapes HTML tags but leaves my special (well, for some people, they are normal here ;]) letters as they are?

Thanks in advance!

balázs

Upvotes: 38

Views: 74258

Answers (6)

Ahmad AlMughrabi

Reputation: 1950

I know it's too late to add my comment, but perhaps the following code will be helpful:

public static String escapeHtml(String string) {
    StringBuilder escapedTxt = new StringBuilder();
    for (int i = 0; i < string.length(); i++) {
        char tmp = string.charAt(i);
        switch (tmp) {
        case '<':
            escapedTxt.append("&lt;");
            break;
        case '>':
            escapedTxt.append("&gt;");
            break;
        case '&':
            escapedTxt.append("&amp;");
            break;
        case '"':
            escapedTxt.append("&quot;");
            break;
        case '\'':
            escapedTxt.append("&#x27;");
            break;
        case '/':
            escapedTxt.append("&#x2F;");
            break;
        default:
            escapedTxt.append(tmp);
        }
    }
    return escapedTxt.toString();
}

enjoy!
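
To check this against the original question, here is a minimal self-contained sketch. The class name EscapeDemo and the sample string are made up for illustration; the method is just the same six replacements as above in condensed form. Accented letters fall through to the default branch and are copied unchanged:

```java
// Hypothetical demo class; only the escaping rules come from the answer above.
class EscapeDemo {

    // Same six replacements as the method above; everything else is copied as-is.
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                case '/':  out.append("&#x2F;"); break;
                default:   out.append(c);        // includes accented letters
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String escaped = escapeHtml("<b>Balázs & \"friends\"</b>");
        // escaped is: &lt;b&gt;Balázs &amp; &quot;friends&quot;&lt;&#x2F;b&gt;
        System.out.println(escaped);
    }
}
```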

Upvotes: 6

TheMaskedCucumber

Reputation: 2039

This looks very good to me:

org/apache/commons/lang3/StringEscapeUtils.html#escapeXml(java.lang.String)

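By asking for XML escaping you get XHTML-safe output, which is good HTML. As of Commons Lang 3.0, escapeXml no longer escapes Unicode characters above 0x7f, so accented letters are left alone.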

Upvotes: 9

andraaspar

Reputation: 886

If you're using Wicket, use:

import org.apache.wicket.util.string.Strings;
...
CharSequence cs = Strings.escapeMarkup(src);
String str = Strings.escapeMarkup(src).toString();

Upvotes: 0

quietmint

Reputation: 14164

Here's a version that replaces the six significant characters as recommended by OWASP. This is suitable for HTML content elements like <textarea>...</textarea>, but not HTML attributes like <input value="..."> because the latter are often left unquoted.

StringUtils.replaceEach(text,
        new String[]{"&", "<", ">", "\"", "'", "/"},
        new String[]{"&amp;", "&lt;", "&gt;", "&quot;", "&#x27;", "&#x2F;"});

Upvotes: 6

goncalossilva

Reputation: 1860

If it's for Android, use TextUtils.htmlEncode(String) instead.

Upvotes: 21

pingw33n

Reputation: 12510

StringUtils.replaceEach(str, new String[]{"&", "\"", "<", ">"}, new String[]{"&amp;", "&quot;", "&lt;", "&gt;"})
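
If Commons Lang is not on the classpath, roughly the same result can be had with plain String.replace. This is only a sketch, and the helper class PlainEscape is a made-up name. One difference to watch: replaceEach makes a single pass over the input, whereas chained replace calls run one after another, so the ampersand has to go first or the entities produced by the other steps would get escaped again.

```java
// Hypothetical helper; mirrors the four replacements in the one-liner above.
class PlainEscape {

    static String escape(String s) {
        return s.replace("&", "&amp;")    // must come first to avoid double-escaping
                .replace("\"", "&quot;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String escaped = escape("<a href=\"x\">é & ő</a>");
        // escaped is: &lt;a href=&quot;x&quot;&gt;é &amp; ő&lt;/a&gt;
        System.out.println(escaped);
    }
}
```

Like the replaceEach version, this leaves accented letters untouched because only the four listed characters are ever replaced.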

Upvotes: 32
