Reputation: 43
Issue: Character encoding becomes corrupted in the Play! 1.2.4 framework.
Context: We are trying to store the text "《我叫MT繁體版》台港澳專屬伺服器上線!" from an input text field in MySQL using the Play! 1.2.4 framework.
Steps that we followed:
1) A UI gets the input from the user. It can be text in any language, so we tried Japanese characters. Note: the page is set to UTF-8 character encoding.
2) The form is posted to a Play! controller, which just reads the input and stores it using a Play! model. The snippet is shown below:
public static void text_create() throws UnsupportedEncodingException,
        ParseException {
    System.out.println("params :: text string value :: " + params.get("text"));
    String oldString = params.get("text");
    // Converting the input string (which is in UTF-8 format) and parsing it as Windows-1252
    String newString = new String(oldString.getBytes(), "WINDOWS-1252");
    // 1. Passing the encoded text to MySQL.
    // 2. The TextCheck table and its 'text' column use UTF-8 encoding and collation.
    // 3. The TextCheck 'text' column is declared as a String in the model.
    TextCheck a = new TextCheck(newString);
    List<Object> text = TextCheck.TextList();
    render(a, text);
}
It stores the TEXT value as "《我�MT�體版》�港澳專屬伺�器上線�"
The problem is that there are � characters in the stored value. When I read this raw data from MySQL using other platforms such as Java, Ruby, or some other language, the rest of the text converts correctly but those � characters stay as junk.
Note: Interestingly, when I read it back through the same Play! framework, it all looks fine; even the junk characters are read correctly.
Question: Why do those junk characters appear?
Upvotes: 1
Views: 650
Reputation: 69339
The problem is the following line:
String newString = new String(oldString.getBytes(), "WINDOWS-1252");
This looks like nonsense to me. Java stores all strings internally using UTF-16, so you can't adjust the encoding of a Java string in the manner you've attempted here.
The getBytes() method returns the bytes of the string using the default platform encoding. You then convert these bytes into a new string using a (probably) different charset. The result is almost certain to be broken.
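To illustrate, here is a minimal sketch of what that conversion does, assuming the platform default encoding is UTF-8 (I can't know what it is on your server, so the sketch pins it explicitly). The class and method names are made up for the example. Some bytes of the UTF-8 form, such as 0x8F in the encoding of 叫, have no character assigned in Windows-1252, so decoding replaces them with U+FFFD (�), and that loss cannot be undone afterwards.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Mimics the line from the question, with the default encoding assumed to be UTF-8:
    // encode to UTF-8 bytes, then decode those bytes as Windows-1252.
    static String misdecode(String s) {
        byte[] utf8Bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, Charset.forName("windows-1252"));
    }

    // Encoding and decoding with the same charset is lossless for valid text.
    static String roundTrip(String s) {
        byte[] utf8Bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u6211\u53eb" is 我叫, written as escapes so the source file encoding does not matter.
        String input = "\u6211\u53eb" + "MT";

        String broken = misdecode(input);
        // The mis-decoded string differs from the input...
        System.out.println("equal after misdecode: " + broken.equals(input));
        // ...and contains the replacement character U+FFFD where a byte had no mapping.
        System.out.println("contains U+FFFD: " + (broken.indexOf('\uFFFD') >= 0));

        // Using one charset consistently keeps the text intact.
        System.out.println("equal after round trip: " + roundTrip(input).equals(input));
    }
}
```

If that assumption holds for your setup, the likely fix is to drop the conversion and pass params.get("text") to the model unchanged, since the string is already correct at that point.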
Upvotes: 1