I am trying to draw a string containing certain Unicode characters onto an image, for example:
" །༉ᵒᵗᵗ͟ᵋༀ 🐢 ͟ ͟ ͟ ͟ ͟ ͟🐢 ͟ ͟ ͟ ͟ ͟ ͟ ͟🐛 ͟ ͟ ͟ ͟ ͟ ͟🐢. . . "
When I draw it on an image, however, I get the following instead:
Image:
I have tried changing the encoding of the string when I receive the bytes. No font is set explicitly, but I have tried all the default Microsoft fonts as well as a few custom ones I found on the Internet. What am I doing wrong?
Edit: The original was using Graphics.DrawString. I have tried TextRenderer and it came out with almost the same results.
Image:
This is the code I'm using to generate the image:
string text = "[rotten4pple] །༉ᵒᵗᵗ͟ᵋༀ 🐢 ͟ ͟ ͟ ͟ ͟ ͟🐢 ͟ ͟ ͟ ͟ ͟ ͟ ͟🐛 ͟ ͟ ͟ ͟ ͟ ͟🐢. . .";
var font = new Font("Arial", 8, FontStyle.Regular);
var bitmap = new Bitmap(1, 1);
var size = Graphics.FromImage(bitmap).MeasureString(text, font);
bitmap = new Bitmap((int)size.Width + 4, (int)size.Height + 4);
using (var gfx = Graphics.FromImage(bitmap))
{
    gfx.Clear(Color.White);
    TextRenderer.DrawText(gfx, cmd.AllArguments, font, new Point(2, 2),
        Color.Black, Color.White);
}
The variable cmd.AllArguments is passed down into the method. I believe the string is encoded using windows-1252.
Upvotes: 5
Views: 853
Reputation: 40210
Don't use Graphics.DrawString for Unicode characters.
You should migrate to TextRenderer.DrawText instead, for example:
TextRenderer.DrawText(e.Graphics, "こんにちは", this.Font,
    new Point(10, 10), this.ForeColor, this.BackColor, flags);
The drawback is that you won't be able to specify a Brush.
I have tested it. I think something else must be going on, because it seems to work for me. Here is my code:
private void Form1_Paint(object sender, PaintEventArgs e)
{
    var text = " །༉ᵒᵗᵗ͟ᵋༀ 🐢 ͟ ͟ ͟ ͟ ͟ ͟🐢 ͟ ͟ ͟ ͟ ͟ ͟ ͟🐛 ͟ ͟ ͟ ͟ ͟ ͟🐢. . . ";
    TextRenderer.DrawText(e.Graphics, "TextRenderer.DrawText" + text, this.Font,
        new Point(10, 10), this.ForeColor, this.BackColor);
    e.Graphics.DrawString("Graphics.DrawString" + text, this.Font,
        new SolidBrush(this.ForeColor), new PointF(10, 30));
}
Note: Font is Arial Unicode MS 8.25pt.
The output:
Here is the original string, stored in UTF-8:
[rotten4pple] །༉ᵒᵗᵗ͟ᵋༀ 🐢 ͟ ͟ ͟ ͟ ͟ ͟🐢 ͟ ͟ ͟ ͟ ͟ ͟ ͟🐛 ͟ ͟ ͟ ͟ ͟ ͟🐢. . .
And here is the wrong string you are getting, stored in Windows-1252:
[rotten4pple] à¼à¼‰áµ’ᵗᵗ͟ᵋༀ 🢠͟ ÍŸ ÍŸ ÍŸ ÍŸ ͟🢠͟ ÍŸ ÍŸ ÍŸ ÍŸ ÍŸ ͟🛠͟ ÍŸ ÍŸ ÍŸ ÍŸ ÍŸðŸ¢. . .
The two are byte-for-byte identical. This is the hexadecimal representation of the bytes for both strings:
5B 72 6F 74 74 65 6E 34 70 70 6C 65 5D 20 E0 BC
8D E0 BC 89 E1 B5 92 E1 B5 97 E1 B5 97 CD 9F E1
B5 8B E0 BC 80 EF A3 BF 20 F0 9F 90 A2 20 CD 9F
20 CD 9F 20 CD 9F 20 CD 9F 20 CD 9F 20 CD 9F F0
9F 90 A2 20 CD 9F 20 CD 9F 20 CD 9F 20 CD 9F 20
CD 9F 20 CD 9F 20 CD 9F F0 9F 90 9B 20 CD 9F 20
CD 9F 20 CD 9F 20 CD 9F 20 CD 9F 20 CD 9F F0 9F
90 A2 2E 20 2E 20 2E
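If you want to check this kind of claim yourself, a dump like the one above can be produced with a few lines. This is only a sketch: the HexDump class, the ToHex helper and the short sample strings are made up for illustration, and the RegisterProvider call assumes .NET Core or .NET 5+, where code page 1252 has to be registered before use (on .NET Framework that line can be dropped).

```csharp
using System;
using System.Text;

public static class HexDump
{
    // Hypothetical helper: encode the text and format the bytes as "C3 A9 ...".
    public static string ToHex(string text, Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetBytes(text)).Replace("-", " ");
    }

    public static void Main()
    {
        // Assumption: .NET Core / .NET 5+, where code page 1252 must be registered first.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding cp1252 = Encoding.GetEncoding(1252);

        // U+00E9 (e with acute accent), encoded as UTF-8.
        Console.WriteLine(ToHex("\u00E9", Encoding.UTF8));   // C3 A9

        // U+00C3 U+00A9 (the garbled form), encoded as Windows-1252.
        Console.WriteLine(ToHex("\u00C3\u00A9", cp1252));    // C3 A9
    }
}
```

Both lines print the same bytes, which is the same situation as with the two long strings: one sequence of bytes, two readings of it.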
Since this is a re-interpretation of the binary values and not a re-encoding, converting from one to the other with Encoding.Convert in .NET is not viable. Instead you should get the binary representation of the string in the wrong encoding and read it as the correct encoding directly:
var text = cmd.AllArguments;
var bytes = Encoding.GetEncoding(1252).GetBytes(text);
text = Encoding.UTF8.GetString(bytes);
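To try the three lines above without the rest of the application, here is a small self-contained sketch. The MojibakeRepair class, the Reinterpret helper and the sample string are hypothetical, and the garbled input is simulated rather than taken from cmd.AllArguments. The RegisterProvider call assumes .NET Core or .NET 5+; it is not needed on .NET Framework.

```csharp
using System;
using System.Text;

public static class MojibakeRepair
{
    // Hypothetical helper: recover the bytes that were wrongly decoded with
    // `wrong`, then decode them with the encoding they were really written in.
    public static string Reinterpret(string garbled, Encoding wrong, Encoding actual)
    {
        byte[] bytes = wrong.GetBytes(garbled);
        return actual.GetString(bytes);
    }

    public static void Main()
    {
        // Assumption: .NET Core / .NET 5+, where code page 1252 must be registered first.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding cp1252 = Encoding.GetEncoding(1252);

        // "cafe" with an accented e, a space, a euro sign and "5".
        string original = "caf\u00E9 \u20AC5";

        // Simulate the bug: UTF-8 bytes read as if they were Windows-1252.
        string garbled = cp1252.GetString(Encoding.UTF8.GetBytes(original));

        string repaired = Reinterpret(garbled, cp1252, Encoding.UTF8);
        Console.WriteLine(repaired == original); // True
    }
}
```

One caveat: a few byte values (0x81, 0x8D, 0x8F, 0x90, 0x9D) have no assigned character in Windows-1252, and some of them occur in the dump above. Whether the round trip is lossless for those bytes depends on how the code that produced the garbled string handled them, so decoding the original bytes as UTF-8 in the first place is the more robust fix.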
Notes
You have been asking which encoding the API you are using applies by default. I'm not familiar with that API, but there is a risk that the default depends on the configuration of the machine. You should look for an overload that lets you specify that you are receiving a UTF-8 string.
Chances are that you are actually receiving a byte[] anyway, so you can use Encoding.UTF8.GetString directly on it. If you cannot specify the encoding, you should consider switching to sending a byte[] instead; that gives you more control over the encoding.
In that regard, don't use Encoding.Default, because on .NET Framework it is the system's ANSI code page, an "extended ASCII" encoding that depends on the language of the machine.
By the way, UTF-8 is a good choice for networking, not only because it is independent of the language and other regional configuration, but also because it is independent of byte order (endianness).
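As a rough illustration of the notes above, this is what decoding explicitly as UTF-8 can look like when you have the raw bytes or a stream. The Utf8Input class and its methods are hypothetical names, and the API you are actually receiving data from may look different; the point is only that the encoding is named explicitly instead of being left to a default.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8Input
{
    // Decode a raw payload explicitly as UTF-8, independent of machine settings.
    public static string Decode(byte[] payload)
    {
        return Encoding.UTF8.GetString(payload);
    }

    // Read a whole stream as UTF-8 via the StreamReader(Stream, Encoding) overload.
    public static string ReadAll(Stream stream)
    {
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // F0 9F 90 A2 is the UTF-8 form of U+1F422, the turtle from the question.
        byte[] payload = { 0xF0, 0x9F, 0x90, 0xA2 };

        string fromBytes = Decode(payload);
        string fromStream = ReadAll(new MemoryStream(payload));

        Console.WriteLine(fromBytes == fromStream);                          // True
        Console.WriteLine(char.ConvertToUtf32(fromBytes, 0).ToString("X"));  // 1F422
    }
}
```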