math125

Reputation: 3

converting string of unicode "\u0063" into "c"

I'm doing some cryptanalysis homework and was trying to write code that does a + b = c. My idea was to use Unicode values: b + (b - a) = c. The problem is that my code returns the Unicode escape for c, not the String "c", and I can't convert it.

Please can someone explain the difference between the string below called unicode and those called test and test2? Also is there any way I could get the string unicodeOfC to print "c"?

//this calculates the unicode value for c
String unicodeOfC = ("\\u" + Integer.toHexString('b'+('b'-'a') | 0x10000).substring(1));

//this prints \u0063
System.out.println(unicodeOfC);

String test = "\u0063";

//this prints c
System.out.println(test);

//this is false
System.out.println(test.equals(unicodeOfC));

String test2 = "\u0063";
//this is true
System.out.println(test.equals(test2));

Upvotes: 0

Views: 417

Answers (2)

Sotirios Delimanolis

Reputation: 279970

There is no difference between test and test2. They are both String literals referring to the same String. This String literal is made up of a unicode escape.

A compiler for the Java programming language ("Java compiler") first recognizes Unicode escapes in its input, translating the ASCII characters \u followed by four hexadecimal digits to the UTF-16 code unit (§3.1) for the indicated hexadecimal value, and passing all other characters unchanged.

So the compiler will translate this unicode escape and convert it to the corresponding UTF-16 code unit. That is, the unicode escape \u0063 translates to the character c.

In this

String unicodeOfC = ("\\u" + Integer.toHexString('b'+('b'-'a') | 0x10000).substring(1));

the String literal "\\u" (which uses a \ character to escape a \ character) has a runtime value of \u, i.e. the two characters \ and u. Integer.toHexString(..) returns "10063", and substring(1) drops the leading 1 to give "0063". That result is concatenated onto \u and assigned to unicodeOfC. So the String value is \u0063, i.e. the 6 characters \, u, 0, 0, 6, and 3. Because this happens at run time, the compiler never sees it as a Unicode escape and never translates it.
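A minimal sketch of the difference, for illustration only. The class and method names (EscapeLengths, builtAtRuntime, compiledLiteral) are made up for this example; the expression itself is the one from the question.

```java
public class EscapeLengths {
    // Built at run time: six separate characters (backslash, u, 0, 0, 6, 3).
    static String builtAtRuntime() {
        return "\\u" + Integer.toHexString(('b' + ('b' - 'a')) | 0x10000).substring(1);
    }

    // Written as an escape in the source file: the compiler replaces it
    // with a single character before the program ever runs.
    static String compiledLiteral() {
        return "\u0063";
    }

    public static void main(String[] args) {
        System.out.println(builtAtRuntime().length());      // 6
        System.out.println(compiledLiteral().length());     // 1
        System.out.println(compiledLiteral().equals("c"));  // true
    }
}
```

The two strings look alike in source code but have different lengths, which is why equals returns false.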

Also is there any way I could get the string unicodeOfC to print "c"?

Just as you built it from a number, you need to extract the numeric part of the Unicode escape, parse it as hexadecimal, and cast the result to a char:

String numerical = unicodeOfC.replace("\\u", "");
int val = Integer.parseInt(numerical, 16);
System.out.println((char) val);

This prints c.
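If the goal is only the letter itself, the escape text may not be needed at all. The following is a sketch under that assumption, not a statement of what the homework requires: it does the arithmetic on char values and casts the result back. NextLetter, next, and decode are hypothetical names; decode wraps the parsing approach above for comparison.

```java
public class NextLetter {
    // Applies the a -> b offset to b, staying in char arithmetic throughout.
    static String next(char a, char b) {
        return String.valueOf((char) (b + (b - a)));
    }

    // Turns a six-character escape text (backslash, u, four hex digits)
    // back into the single character it names.
    static String decode(String escaped) {
        int codeUnit = Integer.parseInt(escaped.substring(2), 16);
        return String.valueOf((char) codeUnit);
    }

    public static void main(String[] args) {
        System.out.println(next('a', 'b'));                 // c
        System.out.println(decode("\\u0063"));              // c
        System.out.println(next('a', 'b').equals("c"));     // true
    }
}
```

The cast to char is what converts the number 99 into the character c; without it, the + operator on chars yields an int.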

Upvotes: 1

Freiheit

Reputation: 8757

I think you're not understanding how string escaping works.

In Java, the backslash is an escape character that lets you put characters such as newlines (\n), tabs (\t), or Unicode escapes (\u0063) into strings.

Suppose I am writing code and need to print a newline. I would write System.out.println("\n");

Now let's say I want to show a backslash. System.out.println("\"); is a compile error, but System.out.println("\\"); prints \.

So your first string contains a literal backslash character, then the letter u, then the hexadecimal digits.
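A small sketch of these escaping rules, with made-up names (EscapeDemo and its methods) used purely for illustration. It compares string lengths, since an escape sequence always counts as the characters it produces, not the characters typed in the source.

```java
public class EscapeDemo {
    // One character: the newline produced by the escape sequence.
    static int newlineLength() {
        return "\n".length();
    }

    // One character: an escaped backslash is a single backslash at run time.
    static int backslashLength() {
        return "\\".length();
    }

    // Six characters: the escaped backslash stops the compiler from
    // treating the rest as a Unicode escape.
    static int escapedTextLength() {
        return "\\u0063".length();
    }

    public static void main(String[] args) {
        System.out.println(newlineLength());      // 1
        System.out.println(backslashLength());    // 1
        System.out.println(escapedTextLength());  // 6
    }
}
```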

Upvotes: 0
