Reputation: 8350
I have a TextWatcher on an EditText. When the user types, I set whatever is in the EditText as the Button label.
EditText et = rootView.findViewById(R.id.userInput);
et.addTextChangedListener(this);
...
@Override public void beforeTextChanged(CharSequence s, int start, int count, int after) {}
@Override public void afterTextChanged(Editable s) {}
@Override
public void onTextChanged(CharSequence s, int start, int before, int count) {
    Button btn = (Button) rootView.findViewById(R.id.myButton);
    btn.setText(s.toString());
    //btn.setText("\u00A9");
}
But I cannot figure out how to enter Unicode symbols. The commented-out line, when uncommented, sets the Button text to the copyright symbol ©. However, typing the same escape sequence into the EditText does not work.
I also tried typing a double backslash in the EditText, but that did not help either.
Note: unrelated to this, when using btn.setText(s) without the toString() part, the text in the button is underlined.
CLARIFICATION: Due to some comments and answers (now deleted), I realize that I was not clear. Let me rephrase:
I don't want to interfere with the user input text in any way. Right now when the user inputs "Hello \u0089" in the EditText, I copy it to the Button text using this line:
btn.setText(s.toString());
and it shows as "Hello \u0089". I expected "Hello ‰". Why? Because if I run a little test and use this line:
btn.setText("Hello \u0089");
it shows as "Hello ‰". So, what is the difference that makes the unicode to show properly in the direct approach but does not show it when is entered through EditText?
Upvotes: 2
Views: 5580
Reputation: 234484
So, what is the difference that makes the unicode to show properly in the direct approach but does not show it when is entered through EditText?
Arggh, I want people to stop saying "the unicode". It's "the text", not "the unicode". Unicode is a standard. The text the user entered is not that one standard, it's just text.
With that out of the way, let's see if I can explain the difference.
When you write a string literal like "Hello \u0089" in Java, your source code file will contain exactly these twelve characters between the quotation marks: H, e, l, l, o, a space, a backslash, u, 0, 0, 8, and 9.
There is no magic involved here. What you type is what you get. The \u0089 sequence is not magical.
However, when you give that same source file to your Java compiler, the Java compiler has an agreement with you, the programmer: it will convert any sequence it finds inside a string literal (in fact, anywhere in the source file) that starts with the characters U+005C U+0075 and is followed by four hexadecimal digit characters into the character that corresponds to the Unicode value specified by those hexadecimal digits. That agreement also includes a provision for when you, the programmer, actually mean that sequence with the backslash, the u, and the hexadecimal digits, i.e., six characters, not one. For that you precede the backslash with another backslash, and the Java compiler performs no transformation other than removing one of those two backslashes.
So, while the source file will have the string literal with twelve characters between quotation marks, the Java compiler will, following the agreement with the programmer set forth by the Java Language Specification, transform that into a string with only seven characters.
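The twelve-versus-seven count can be checked directly. The following is a minimal sketch; the class name `EscapeLengthDemo` and its field names are made up for illustration.

```java
public class EscapeLengthDemo {
    // The compiler translates the escape: five letters, a space, and one character.
    static final String TRANSLATED = "Hello \u0089";
    // The doubled backslash keeps the escape as six literal characters.
    static final String LITERAL = "Hello \\u0089";

    public static void main(String[] args) {
        System.out.println(TRANSLATED.length());             // 7
        System.out.println(LITERAL.length());                // 12
        System.out.println((int) TRANSLATED.charAt(6));      // 137, i.e. 0x89
        System.out.println(LITERAL.charAt(6));               // the backslash itself
    }
}
```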
Now, when the user is entering text into some UI, they are not typing in Java string literals that will later be processed by the Java compiler, or are they?
They are not. When the user types a backslash followed by a u and some digits, the user gets a backslash followed by a u and some digits. When the user inputs \u0089 in a text field, that text field holds a string with six characters, not one. There is no Java compiler there with any pre-agreed convention to represent characters by their Unicode values; it's just a user entering text, not Java code.
When the user inputs \u0089 in a text field, the text field holds a string that can be represented in Java source code as "\\u0089", not "\u0089".
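To see this without an Android device, one can simulate the keystrokes. This is only a sketch: the class `TypedInputDemo` and its `typed()` method are hypothetical stand-ins for whatever a text field would hand back, built one character at a time the way a user would type them.

```java
public class TypedInputDemo {
    // Hypothetical stand-in for the contents of a text field after six keystrokes.
    static String typed() {
        char[] keys = {'\\', 'u', '0', '0', '8', '9'};
        return new String(keys);
    }

    public static void main(String[] args) {
        String input = typed();
        System.out.println(input.length());             // 6
        System.out.println(input.equals("\\u0089"));    // true: six literal characters
        System.out.println(input.equals("\u0089"));     // false: that literal is one character
    }
}
```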
If you want to give that kind of user input the same meaning that the Java compiler gives to those Unicode escape sequences, you need to call code that performs that transformation before displaying it.
FOR COMPLETENESS: This is the OP, posting the code I wrote based on the answer above.
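// Needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static String convertUnicode(CharSequence s) {
    StringBuffer result = new StringBuffer();
    // A backslash, a 'u', then exactly four hexadecimal digits. No trailing word
    // boundary, so an escape followed directly by letters or digits is decoded too.
    Matcher m = Pattern.compile("\\\\u([0-9a-fA-F]{4})").matcher(s);
    while (m.find()) {
        char c = (char) Integer.parseInt(m.group(1), 16);
        // quoteReplacement keeps a decoded '\' or '$' from being read as replacement syntax.
        m.appendReplacement(result, Matcher.quoteReplacement(String.valueOf(c)));
    }
    m.appendTail(result);
    return result.toString();
}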
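For comparison, here is a regex-free sketch of the same idea with a small usage example. The class `EscapeDecoder` and its methods are hypothetical names, not part of any library. It assumes plain Java and only handles the backslash, u, four-hex-digit form.

```java
public class EscapeDecoder {
    /** Replaces each backslash, 'u', four-hex-digit run with the character it names. */
    static String decode(CharSequence input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            if (isEscapeAt(input, i)) {
                String hex = input.subSequence(i + 2, i + 6).toString();
                out.append((char) Integer.parseInt(hex, 16));
                i += 6; // skip the whole six-character escape
            } else {
                out.append(input.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    /** True if a backslash, a 'u', and four hex digits start at index i. */
    static boolean isEscapeAt(CharSequence s, int i) {
        if (i + 6 > s.length() || s.charAt(i) != '\\' || s.charAt(i + 1) != 'u') {
            return false;
        }
        for (int k = i + 2; k < i + 6; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String typed = "Hello \\u00A9";   // what the text field would hold: 12 characters
        String shown = decode(typed);     // what would be passed to setText: 7 characters
        System.out.println(typed.length());
        System.out.println(shown.length());
        System.out.println(shown.equals("Hello \u00A9"));
    }
}
```

Unlike the Java compiler, this sketch does not treat a doubled backslash as a way to keep the escape literal; that would need an extra rule if users must be able to type the six characters verbatim.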
Upvotes: 6