Reputation: 61
I created a TCP server that distributes clients' messages, and I ran into a problem. When I send Cyrillic messages through the stream, they are not decoded properly. Does anyone know how I can fix that?
Here's the code for sending the message:
var message = Console.ReadLine().ToCharArray().Select(x => (byte)x).ToArray();
stream.Write(message);
Here's the code for receiving:
var numberOfBytes = stream.Read(buffer,0,1024);
Console.WriteLine($"{numberOfBytes} bytes received");
var chars = buffer.Select(x=>(char)x).ToArray();
var message = new string(chars);
Upvotes: 2
Views: 25614
Reputation: 589
The problem is that a char in C# is a 2-byte UTF-16 code unit. Cyrillic characters have code points above 255, so you lose information when you cast one to a byte.
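To make the loss concrete, here is a minimal sketch that compares the cast from the question with UTF-8 encoding for a single Cyrillic letter. It assumes C# 9 or later (top-level statements), and `CastToBytes` is a hypothetical helper name used only for illustration.

```csharp
using System;
using System.Linq;
using System.Text;

// Mirrors the cast from the question: keeps only the low 8 bits of each char.
byte[] CastToBytes(string text) =>
    text.Select(c => unchecked((byte)c)).ToArray();

string original = "Ж"; // U+0416, a single UTF-16 code unit

byte[] truncated = CastToBytes(original);       // 0x0416 is cut down to 0x16
byte[] utf8 = Encoding.UTF8.GetBytes(original); // two bytes in UTF-8

Console.WriteLine(BitConverter.ToString(truncated)); // 16
Console.WriteLine(BitConverter.ToString(utf8));      // D0-96

// Only the UTF-8 bytes can be turned back into the original text.
Console.WriteLine(Encoding.UTF8.GetString(utf8) == original); // True
```

The high byte (0x04) is gone after the cast, so the receiver has no way to tell the letter apart from the control character 0x16.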
To convert a string to a byte array, use the Encoding class:
byte[] buffer = System.Text.Encoding.UTF8.GetBytes(Console.ReadLine());
To convert it back to a string on the receiver's end, write:
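// numberOfBytes is the value returned by stream.Read; decode only the bytes actually received
string message = System.Text.Encoding.UTF8.GetString(buffer, 0, numberOfBytes);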
Another problem is that Stream.Read is not guaranteed to read all bytes of your message at once (the stream does not know that you send packets of a certain size). So it could happen, for example, that the last byte in the received array is only the first byte of a 2-byte character, and you receive the other byte the next time you call Stream.Read.
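The split described above can be simulated without a network. The following sketch (assuming C# 9 or later) decodes the two bytes of one Cyrillic letter separately, as if they had arrived in two Read calls, and then, for comparison, feeds the same bytes through a stateful `System.Text.Decoder`, which holds on to an incomplete sequence until the rest arrives.

```csharp
using System;
using System.Text;

byte[] bytes = Encoding.UTF8.GetBytes("Ж"); // 0xD0 0x96

// Simulate two Read calls that each deliver one byte of the same character.
string first = Encoding.UTF8.GetString(bytes, 0, 1);
string second = Encoding.UTF8.GetString(bytes, 1, 1);
Console.WriteLine(first + second == "Ж"); // False: each half decodes to U+FFFD

// A Decoder keeps the incomplete byte until the rest arrives.
Decoder decoder = Encoding.UTF8.GetDecoder();
char[] chars = new char[4];
int n1 = decoder.GetChars(bytes, 0, 1, chars, 0);  // 0 chars so far
int n2 = decoder.GetChars(bytes, 1, 1, chars, n1); // completes the character
Console.WriteLine(new string(chars, 0, n1 + n2) == "Ж"); // True
```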
There are several solutions to this issue:
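One widely used option, not necessarily one the answer had in mind, is to prefix every message with its length in bytes so the receiver knows how much to read before decoding. Below is a minimal sketch assuming C# 9 or later. `WriteMessage`, `ReadFully` and `ReadMessage` are hypothetical helper names, and a `MemoryStream` stands in for the `NetworkStream`.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: writes a 4-byte length followed by the UTF-8 payload.
void WriteMessage(Stream stream, string text)
{
    byte[] payload = Encoding.UTF8.GetBytes(text);
    byte[] header = BitConverter.GetBytes(payload.Length); // machine byte order
    stream.Write(header, 0, header.Length);
    stream.Write(payload, 0, payload.Length);
}

// Hypothetical helper: loops until exactly count bytes have been read.
byte[] ReadFully(Stream stream, int count)
{
    byte[] buffer = new byte[count];
    int offset = 0;
    while (offset < count)
    {
        int read = stream.Read(buffer, offset, count - offset);
        if (read == 0) throw new EndOfStreamException("Connection closed mid-message.");
        offset += read;
    }
    return buffer;
}

// Hypothetical helper: reads the length header, then the whole payload, then decodes.
string ReadMessage(Stream stream)
{
    int length = BitConverter.ToInt32(ReadFully(stream, 4), 0);
    return Encoding.UTF8.GetString(ReadFully(stream, length));
}

// MemoryStream stands in for the NetworkStream here.
using var memory = new MemoryStream();
WriteMessage(memory, "Привет");
WriteMessage(memory, "мир");
memory.Position = 0;
Console.WriteLine(ReadMessage(memory) == "Привет"); // True
Console.WriteLine(ReadMessage(memory) == "мир");    // True
```

Because decoding happens only after the full payload has arrived, a character can never be cut in half. The header uses the machine's byte order, so both ends would need to agree on it (or convert to a fixed order) if they run on different architectures.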
Upvotes: 3
Reputation: 100786
To convert a string to bytes, use System.Text.Encoding.GetBytes(string). I suggest you change the sending code to:
// using System.Text;
var messageAsBytes = Encoding.UTF8.GetBytes(Console.ReadLine());
To convert bytes to a string, use System.Text.Encoding.GetString(byte[]). If you receive UTF-8-encoded bytes:
// using System.Text;
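// Pass the count returned by stream.Read so unused buffer bytes are not decoded.
var messageAsString = Encoding.UTF8.GetString(buffer, 0, numberOfBytes);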
Some suggested reading:
Upvotes: 0