Reputation: 1201
How can I convert an extended ASCII character to its decimal value?
char symbol = '€';
int value = (int) symbol;
I tried the code above, but it returned the value 8364.
Upvotes: 0
Views: 1139
Reputation: 25458
The following code encodes the euro symbol using every character set that your local Java installation has available:
import java.nio.charset.Charset;
import java.util.Map;

public class CharsetTest {
    public static void main(String[] args) {
        String euro = "€";
        Map<String, Charset> charsets = Charset.availableCharsets();
        for (Map.Entry<String, Charset> entry : charsets.entrySet()) {
            Charset cs = entry.getValue();
            byte[] bytes;
            try {
                // Some charsets cannot encode at all and throw here.
                bytes = euro.getBytes(cs);
            } catch (Exception e) {
                System.err.println(entry.getKey() + " encode failed");
                continue;
            }
            // Print the canonical name, then its aliases, then the encoded bytes.
            System.out.print(entry.getKey());
            for (String alias : cs.aliases()) {
                System.out.print(" " + alias);
            }
            for (byte bb : bytes) {
                System.out.print(" " + bb);
            }
            System.out.println();
        }
    }
}
Many charsets return 63 (ASCII "?") for the euro symbol. That is the usual substitution for a character that the character set cannot represent. The value you are looking for is 128, which prints as -128 because Java bytes are signed. When I run this, I get -128 for several character sets:
windows-1250 cp1250 cp5346 -128
windows-1252 cp5348 cp1252 -128
windows-1253 cp1253 cp5349 -128
windows-1254 cp1254 cp5350 -128
windows-1255 cp1255 -128
windows-1256 cp1256 -128
windows-1257 cp1257 cp5353 -128
windows-1258 cp1258 -128
x-IBM874 ibm-874 ibm874 874 cp874 -128
x-mswin-936 ms936 ms_936 -128
x-windows-874 ms-874 ms874 windows-874 -128
Using any of those character sets, you can encode the symbol like this. bytes[0] will be -128, and bytes[0] & 0xFF gives the expected value of 128:
String euro = "€";
byte[] bytes = euro.getBytes(Charset.forName("charsetname"));
I suspect windows-1252 is the character set that you want, but you could look at the Wikipedia pages for the others and see whether one of them is more appropriate for your purpose.
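To make the signed-byte point concrete, here is a minimal sketch that encodes the euro sign and reads the result as an unsigned value. The class name EuroByteValue and the helper unsignedFirstByte are made up for illustration, and it assumes windows-1252 is available in your Java installation (it was in the output above). The euro sign is written as the escape \u20AC so the result does not depend on the source file's encoding.

```java
import java.nio.charset.Charset;

public class EuroByteValue {
    // Hypothetical helper: encodes the text in the named charset and
    // returns the first encoded byte as an unsigned value (0 to 255).
    static int unsignedFirstByte(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        // Java bytes are signed, so mask with 0xFF to turn -128 into 128.
        return bytes[0] & 0xFF;
    }

    public static void main(String[] args) {
        String euro = "\u20AC"; // the euro sign
        System.out.println(unsignedFirstByte(euro, "windows-1252")); // prints 128
    }
}
```

Passing a charset that has no euro sign, such as US-ASCII, should give 63 instead, which matches the "?" substitution described above.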
Upvotes: 1
Reputation: 2398
Java does not store characters as extended ASCII or ISO 8859-1. A char holds a UTF-16 code unit, so casting it to int gives the character's Unicode code point. In Unicode, € (EURO SIGN) is U+20AC, which is 8364 in decimal.
For more on this, see: UTF-8 Currency Symbols
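As a quick check of where 8364 comes from, the sketch below prints the numeric value of the euro char and, for comparison, the bytes it becomes when encoded as UTF-8. The class name EuroCodePoint and the helper utf8Hex are made up for illustration; only standard library calls are used.

```java
import java.nio.charset.StandardCharsets;

public class EuroCodePoint {
    // Hypothetical helper: returns the UTF-8 encoding of s as
    // space-separated uppercase hex bytes.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            // Mask with 0xFF so negative bytes format as two hex digits.
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        char euro = '\u20AC'; // the euro sign
        System.out.println((int) euro);                    // prints 8364
        System.out.println(Integer.toHexString(euro));     // prints 20ac
        System.out.println(utf8Hex(String.valueOf(euro))); // prints E2 82 AC
    }
}
```

The cast gives the Unicode value 8364 (hex 20AC), while the UTF-8 encoding of the same character is the three bytes E2 82 AC, so neither matches the single byte 128 used by the Windows code pages.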
Upvotes: 0