Anna

Reputation: 61

Decoding a string in C#

I created a TCP server that distributes clients' messages and ran into a problem. When I send Cyrillic messages through the stream, they are not decoded properly. Does anyone know how I can fix that?

Here's the code for sending the message:

var message = Console.ReadLine().ToCharArray().Select(x => (byte)x).ToArray();
stream.Write(message);

Here's the code for receiving:

var numberOfBytes = stream.Read(buffer,0,1024);
Console.WriteLine($"{numberOfBytes} bytes received");
var chars = buffer.Select(x=>(char)x).ToArray();
var message = new string(chars);

Upvotes: 2

Views: 25614

Answers (2)

Cepheus

Reputation: 589

The problem is that a char in C# is a 16-bit UTF-16 code unit. Cyrillic characters have code points above 255 (the basic Cyrillic block is U+0400 to U+04FF), so casting one to a byte discards the high byte and you lose information.
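To make the loss concrete, here is a small sketch that mirrors the cast from the question. The sample text and the variable names are my own, and it assumes the default unchecked arithmetic context:

```csharp
using System;
using System.Linq;

// Mirrors the question's sending code: each UTF-16 code unit is cast to a byte.
string original = "Привет";
byte[] narrowed = original.Select(c => (byte)c).ToArray();

// 'П' is U+041F; the cast keeps only the low byte, 0x1F.
Console.WriteLine($"U+{(int)original[0]:X4} -> 0x{narrowed[0]:X2}");

// Widening the bytes back to chars cannot restore the discarded high byte.
string restored = new string(narrowed.Select(b => (char)b).ToArray());
Console.WriteLine(restored == original); // False
```

Latin text survives this cast only because its code points happen to fit in one byte.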

To convert a string to a byte array, use the Encoding class:

byte[] buffer = System.Text.Encoding.UTF8.GetBytes(Console.ReadLine());

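To convert it back to a string on the receiver's end, decode only the bytes that were actually read (numberOfBytes in your code), otherwise the unused rest of the buffer ends up in the string as trailing null characters:

string message = System.Text.Encoding.UTF8.GetString(buffer, 0, numberOfBytes);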
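Put together, a round trip could look roughly like this. It is only a sketch: a MemoryStream stands in for your network stream so it runs on its own, and the sample text is made up.

```csharp
using System;
using System.IO;
using System.Text;

// A MemoryStream stands in for the network stream (an assumption for this sketch).
var stream = new MemoryStream();

// Sender: encode the text as UTF-8 before writing it.
byte[] payload = Encoding.UTF8.GetBytes("Привет");
stream.Write(payload, 0, payload.Length);
stream.Position = 0;

// Receiver: decode only the bytes that Read actually returned.
byte[] buffer = new byte[1024];
int numberOfBytes = stream.Read(buffer, 0, buffer.Length);
string message = Encoding.UTF8.GetString(buffer, 0, numberOfBytes);

Console.WriteLine($"{numberOfBytes} bytes received: {message}");
```

Each of the six Cyrillic letters takes two bytes in UTF-8, so this reports 12 bytes.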

Another problem is that Stream.Read is not guaranteed to return your whole message in one call (the stream does not know where your messages begin and end). For example, the last byte you receive could be only the first byte of a 2-byte character, with the second byte arriving the next time you call Stream.Read.
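Here is a sketch of that split-character case. The two one-byte chunks are simulated by hand rather than coming from a real socket. It also shows that a System.Text.Decoder, which keeps the incomplete sequence between calls, is one way to decode chunk by chunk:

```csharp
using System;
using System.Text;

byte[] encoded = Encoding.UTF8.GetBytes("Я"); // two bytes: 0xD0 0xAF

// Decoding each half independently yields replacement characters (U+FFFD).
string broken = Encoding.UTF8.GetString(encoded, 0, 1)
              + Encoding.UTF8.GetString(encoded, 1, 1);

// A Decoder remembers the incomplete sequence between calls.
Decoder decoder = Encoding.UTF8.GetDecoder();
char[] chars = new char[8];
var text = new StringBuilder();

int count = decoder.GetChars(encoded, 0, 1, chars, 0); // first chunk: no char yet
text.Append(chars, 0, count);
count = decoder.GetChars(encoded, 1, 1, chars, 0);     // second chunk completes the char
text.Append(chars, 0, count);

Console.WriteLine(text.ToString() == "Я");        // True
Console.WriteLine(broken.IndexOf('\uFFFD') >= 0); // True
```

This only fixes characters that are split across reads. It does not tell you where one message ends and the next begins.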

There are several solutions to this issue:

  1. Wrap the Stream in a StreamWriter at the sender's end and in a StreamReader at the receiver's end. This is probably the simplest method if you transmit only text.
  2. Transmit the length of your message, as an integer, at the beginning of the message. This number tells the receiver how many bytes it has to read.
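Both options can be sketched as follows. SendMessage and ReceiveMessage are hypothetical helper names I made up, the 4-byte little-endian prefix is just what BinaryWriter.Write(int) produces, and MemoryStream again stands in for the network stream:

```csharp
using System;
using System.IO;
using System.Text;

// Option 2 (hypothetical helper): write a 4-byte length prefix, then the UTF-8 payload.
static void SendMessage(Stream target, string text)
{
    byte[] payload = Encoding.UTF8.GetBytes(text);
    var writer = new BinaryWriter(target, Encoding.UTF8, leaveOpen: true);
    writer.Write(payload.Length); // little-endian Int32
    writer.Write(payload);
    writer.Flush();
}

// Option 2 (hypothetical helper): read the prefix, then exactly that many bytes.
static string ReceiveMessage(Stream source)
{
    var reader = new BinaryReader(source, Encoding.UTF8, leaveOpen: true);
    int length = reader.ReadInt32();
    // ReadBytes keeps reading until it has `length` bytes or the stream ends.
    byte[] payload = reader.ReadBytes(length);
    if (payload.Length != length)
        throw new EndOfStreamException("Stream ended in the middle of a message.");
    return Encoding.UTF8.GetString(payload);
}

var framedStream = new MemoryStream();
SendMessage(framedStream, "Привет");
SendMessage(framedStream, "мир");
framedStream.Position = 0;
string first = ReceiveMessage(framedStream);
string second = ReceiveMessage(framedStream);
Console.WriteLine($"{first}, {second}");

// Option 1: newline-delimited text through StreamWriter/StreamReader.
var textStream = new MemoryStream();
var lineWriter = new StreamWriter(textStream, new UTF8Encoding(false), 1024, leaveOpen: true);
lineWriter.WriteLine("Привет");
lineWriter.Flush();
textStream.Position = 0;
var lineReader = new StreamReader(textStream, Encoding.UTF8);
var line = lineReader.ReadLine();
Console.WriteLine(line);
```

With option 1 the newline acts as the message boundary, so it only works if a message never contains a line break. The length prefix has no such restriction.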

Upvotes: 3

codeape

Reputation: 100786

To convert a string to bytes, use System.Text.Encoding.GetBytes(string). I suggest you change the sending code to:

// using System.Text;
var messageAsBytes = Encoding.UTF8.GetBytes(Console.ReadLine());

To convert bytes to a string, use System.Text.Encoding.GetString(byte[], int, int) so that only the bytes actually received are decoded. If you receive UTF-8-encoded bytes:

// using System.Text;
var messageAsString = Encoding.UTF8.GetString(buffer, 0, numberOfBytes);


Upvotes: 0
