Morvael

Reputation: 3567

Convert from hex string (from UCS-2) into UTF-8

I'm using a third-party SMS provider and have hit an issue converting UCS-2 messages back into readable text.

Their API documentation has this code sample, which converts a message into the UCS-2 hex format I'm receiving from the API.

string message = "Это тестовое сообщение юникода";
byte[] ba = Encoding.BigEndianUnicode.GetBytes (message);
var hexString = BitConverter.ToString (ba);
Console.WriteLine ("@U" + hexString.Replace("-",""));

Which converts the message string into

@U042D0442043E00200442043504410442043E0432043E043500200441043E043E043104490435043D043804350020044E043D0438043A043E04340430

This looks like the UCS-2 messages I'm picking up from their API. Unfortunately, they don't give any code samples showing how to convert the messages back into a readable form.

I'm sure it's not in the docs because it's something simple, but I just can't seem to figure out how to do it.

Upvotes: 1

Views: 632

Answers (2)

Mong Zhu

Reputation: 23732

It looks like this would be the reverse:

string message = Encoding.BigEndianUnicode.GetString(ba);

The bytes could be extracted from the hex string with a method like this:

private IEnumerable<byte> GetTheBytes(string uc2Message)
{
    string bytesOnly = uc2Message.Trim('@', 'U');
    // Walk the hex digits two at a time; "i + 1 < Length" keeps the final byte.
    for (int i = 0; i + 1 < bytesOnly.Length; i += 2)
    {
        yield return Convert.ToByte($"{bytesOnly[i]}{bytesOnly[i+1]}", 16);
    }
}

// ToArray() needs "using System.Linq;"; uc2Message is the "@U..." string from the API.
Console.WriteLine(Encoding.BigEndianUnicode.GetString(GetTheBytes(uc2Message).ToArray()));
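
On .NET 5 or later, the hex parsing can also be delegated to Convert.FromHexString instead of a hand-written loop. This is a minimal sketch, assuming the provider always sends an even number of valid hex digits after the @U marker; the Ucs2Hex class and Decode method names are made up for illustration.

```csharp
using System;
using System.Text;

static class Ucs2Hex
{
    // Hypothetical helper name; Convert.FromHexString requires .NET 5 or later.
    public static string Decode(string encoded)
    {
        // Drop the provider's "@U" marker if it is present.
        string hex = encoded.StartsWith("@U", StringComparison.Ordinal)
            ? encoded.Substring(2)
            : encoded;

        // Throws FormatException for an odd length or non-hex characters.
        byte[] bytes = Convert.FromHexString(hex);

        // The payload is big-endian UTF-16, matching the encoding side of the sample.
        return Encoding.BigEndianUnicode.GetString(bytes);
    }
}
```

For example, Ucs2Hex.Decode("@U042D0442043E") should give "Это", the first word of the sample message.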

Upvotes: 1

TheGeneral

Reputation: 81493

To reverse what you have (the hex string prefixed with @U):

var message = "Это тестовое сообщение юникода";
var ba = Encoding.BigEndianUnicode.GetBytes(message);
var hexString = BitConverter.ToString(ba);
var encoded = "@U" + hexString.Replace("-", "");
Console.WriteLine(encoded);

// reverse
var bytes = Enumerable.Range(2, encoded.Length-2)
   .Where(x => x % 2 == 0)
   .Select(x => Convert.ToByte(encoded.Substring(x, 2), 16))
   .ToArray();

var result = Encoding.BigEndianUnicode.GetString(bytes);
Console.WriteLine(result);

Output

@U042D0442043E00200442043504410442043E0432043E043500200441043E043E043104490435043D043804350020044E043D0438043A043E04340430
Это тестовое сообщение юникода
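
Since the title asks for UTF-8: once the hex has been decoded into a .NET string, getting the UTF-8 bytes is one more call. This is a rough sketch, assuming the input always starts with @U and is followed by an even number of hex digits; Utf8Reencode and ToUtf8 are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text;

static class Utf8Reencode
{
    // Hypothetical helper: "@U" + big-endian UTF-16 hex -> UTF-8 bytes.
    public static byte[] ToUtf8(string encoded)
    {
        // Skip the two-character "@U" marker, then parse each pair of hex digits.
        byte[] utf16 = Enumerable.Range(0, (encoded.Length - 2) / 2)
            .Select(i => Convert.ToByte(encoded.Substring(2 + i * 2, 2), 16))
            .ToArray();

        // Decode to a string first, then re-encode that string as UTF-8.
        string text = Encoding.BigEndianUnicode.GetString(utf16);
        return Encoding.UTF8.GetBytes(text);
    }
}
```

With the first three characters of the sample, BitConverter.ToString(Utf8Reencode.ToUtf8("@U042D0442043E")) should give D0-AD-D1-82-D0-BE, which is "Это" encoded as UTF-8.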


Upvotes: 3
