GyRo

Reputation: 2646

How can WebLogic send a page with the Unicode charset?

99.9% of the pages in my application are using UTF-8 encoding.

However, for a special use case on the client side, I need one of them to use Unicode (2 bytes for each character).

To that end, the header of this page is:

<%@ page language="java" contentType="text/html; charset=unicode"%>
...<my content>...

This implementation works fine and does the job when the application runs on Tomcat and WebSphere. However, when it is deployed on WebLogic, I get the server error: unsupported encoding: 'unicode': java.io.UnsupportedEncodingException: unicode

Does anyone know how I can force WebLogic to send pages in 'Unicode' encoding?

Upvotes: 1

Views: 4183

Answers (2)

BalusC

Reputation: 1108782

UTF-8 is already a Unicode encoding. "Unicode" is not a character encoding on its own; it is a character mapping standard (a charset). Your problem probably lies somewhere else. Maybe you've had problems with GET request encoding, which is often overlooked. You may then find this article useful for background information and complete solutions for getting Unicode to work in a Java EE web application: Unicode - How to get the characters right?

Good luck.

By the way, "2 bytes per character" is characteristic of most of the UTF-16 encoding: code points 0x0000 through 0xFFFF are represented in 2 bytes, while UTF-8 uses 1, 2 or 3 bytes for the subranges of that range. Maybe you just meant to use UTF-16 instead?
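To make the byte counts above concrete, here is a minimal sketch that measures how many bytes a few sample characters take in each encoding. The class and method names are made up for illustration, and UTF-16BE is used so that no byte order mark is added to the count.

```java
import java.nio.charset.StandardCharsets;

public class EncodingWidths {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies when encoded as UTF-16BE
    // (big-endian, without a byte order mark).
    static int utf16Length(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // U+0041 (A), U+00E9 (e with acute), U+20AC (euro sign),
        // and U+1F600, which lies outside the 0x0000-0xFFFF range.
        String[] samples = {"A", "\u00e9", "\u20ac", "\uD83D\uDE00"};
        for (String s : samples) {
            System.out.println("U+" + Integer.toHexString(s.codePointAt(0)).toUpperCase()
                    + " UTF-8: " + utf8Length(s) + " bytes"
                    + ", UTF-16: " + utf16Length(s) + " bytes");
        }
    }
}
```

The first three samples take 1, 2 and 3 bytes in UTF-8 but always 2 bytes in UTF-16; the last one, outside the 0xFFFF range, takes 4 bytes in both.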

Upvotes: 3

Damien B

Reputation: 2003

Unicode is not a charset, but there are charsets that can represent the characters of the Unicode system. You already know the UTF-8 charset, which encodes each character with 1, 2, 3 or 4 bytes, depending on the position of the character in the system. It seems that you want to use the UTF-16 charset, which encodes each character with 2 or 4 bytes.

Note related to the answer provided by BalusC: here I use the word "charset" to mean "the name of the character encoding in the Content-Type MIME header". Strictly speaking, the Universal Character Set provided by Unicode is a character set, but the charset parameter does not strictly specify a character set.
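As a rough way to see why the same page can behave differently across servers, the sketch below asks the running JVM which charset names it accepts. Whether "unicode" is recognized as an alias depends on the JVM, which is an assumption worth checking on each server; the names UTF-8 and UTF-16 are guaranteed by the Java platform. The helper `resolveOrUtf16` is hypothetical and not part of any server API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetResolver {
    // Hypothetical helper: returns the named charset if this JVM supports it,
    // otherwise falls back to UTF-16, which every Java platform must provide.
    static Charset resolveOrUtf16(String name) {
        if (Charset.isSupported(name)) {
            return Charset.forName(name);
        }
        return StandardCharsets.UTF_16;
    }

    public static void main(String[] args) {
        // "unicode" may or may not be a known alias on a given JVM.
        for (String name : new String[] {"unicode", "UTF-16", "UTF-8"}) {
            System.out.println(name + " supported: " + Charset.isSupported(name)
                    + ", resolved to: " + resolveOrUtf16(name).name());
        }
    }
}
```

Applied to the page directive in the question, the corresponding change would presumably be to write charset=UTF-16 instead of charset=unicode; this has not been tested on WebLogic here.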

Upvotes: 1
