Reputation: 9380
Hello, I want to hard-code the byte values of some UTF-8 characters, e.g. '$', '-', '+'.
For '$', how is the byte value calculated from this:
symbol  char    octal code point  binary code point  binary utf8
$       U+0024  044               010 0100           00100100
Which value from these columns gets encoded to a byte?
public class Constants{
public const byte dollar= [value pick from where ?]
public const byte minus= [pick value from where?]
}
Which column from the above should I look at to get the byte?
Is there any formula relating the char column value to the byte value?
Upvotes: 0
Views: 1181
Reputation: 111950
For ASCII chars (so chars in the range 0-127), you can simply cast them:
public const byte dollar = (byte)'$';
Otherwise:
public const byte dollar = 0x0024;
So use the char column: remove the U+ and add 0x. This is valid only for characters in the range 0x0000-0x007F.
Note that there is no difference in the compiled code (checked on sharplab):
public const byte dollar = (byte)'$';
public const byte dollar2 = 0x0024;
gets compiled to:
.field public static literal uint8 dollar = uint8(36)
.field public static literal uint8 dollar2 = uint8(36)
With C# 7.0, if you hate the world and you want to obfuscate your code, you can:
public const byte dollar = 0b00100100;
(they added binary literals; 0b is the prefix)
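The rule above (for U+0000 to U+007F the code point and the UTF-8 byte are the same number) can be checked with a small sketch. The class name `AsciiByteCheck` and the helper `IsSingleByteUtf8` are hypothetical names introduced only for this illustration; the sketch compares a plain cast with what `Encoding.UTF8.GetBytes` produces.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class AsciiByteCheck
{
    // Values read from the code point column: U+0024 -> 0x24, and so on.
    public const byte Dollar = (byte)'$'; // 0x24
    public const byte Minus = (byte)'-';  // 0x2D
    public const byte Plus = (byte)'+';   // 0x2B

    // True when the char is in the ASCII range, where casting to byte
    // yields the same value as its single-byte UTF-8 encoding.
    public static bool IsSingleByteUtf8(char c) => c <= 0x7F;

    public static void Main()
    {
        foreach (char c in new[] { '$', '-', '+' })
        {
            // Encode the character as UTF-8 and print it next to the cast value.
            byte[] encoded = Encoding.UTF8.GetBytes(c.ToString());
            Console.WriteLine(
                $"{c}: cast = 0x{(byte)c:X2}, UTF-8 = {BitConverter.ToString(encoded)}");
        }
    }
}
```

For anything where `IsSingleByteUtf8` returns false, a single `byte` constant cannot hold the encoding and a `byte[]` is needed instead.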
Upvotes: 1
Reputation: 157146
The characters you refer to are ASCII characters, which UTF-8 encodes as single bytes with the same value as the code point. (Note that UTF-8 only uses multiple bytes, two to four, for characters outside the ASCII character set.)
Because of that, you can just cast them:
public const byte dollar = (byte)'$';
If you need a multi-byte UTF-8 character as bytes, you should use an array. For the trademark sign (U+2122), which UTF-8 encodes as three bytes:
public static readonly byte[] trademark = new byte[] { 226, 132, 162 };
Or, more explicit, but also worse for performance:
public static readonly byte[] trademark = Encoding.UTF8.GetBytes("\u2122");
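To see which bytes a given code point actually produces, a short sketch like the following can help. The class and method names (`Utf8MultiByteDemo`, `Utf8BytesAsDecimal`) are hypothetical and only serve the illustration. Note that U+0099 is a C1 control character that encodes to two bytes, whereas the trademark sign is U+2122 and encodes to three.

```csharp
using System;
using System.Text;

// Hypothetical demo class, for illustration only.
public static class Utf8MultiByteDemo
{
    // Encodes a string as UTF-8 and formats the bytes as decimal values.
    public static string Utf8BytesAsDecimal(string s) =>
        string.Join(", ", Encoding.UTF8.GetBytes(s));

    public static void Main()
    {
        Console.WriteLine(Utf8BytesAsDecimal("$"));      // 36
        Console.WriteLine(Utf8BytesAsDecimal("\u0099")); // 194, 153
        Console.WriteLine(Utf8BytesAsDecimal("\u2122")); // 226, 132, 162
    }
}
```

Hard-coding the array avoids the encoding call at type initialization, while the `GetBytes` form documents which character the bytes stand for.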
Upvotes: 1