AMH

Reputation: 6451

Convert from string to byte strange behavior

I have a string like "0100110011001" and I want to convert it to a byte array that contains zeros and ones. The problem is that after the conversion the array contains 49 and 48 instead, and I don't know why. I tried several encodings; for example, I used the following code and changed the encoding type:

 System.Text.UTF8Encoding encoding = new System.Text.UTF8Encoding();
            byte[] result = encoding.GetBytes(str);

Any idea why this happens, and how I can get the output I want?

Upvotes: 3

Views: 880

Answers (3)

Huusom

Reputation: 5912

As a one-line LINQ statement (not that I would recommend this solution):

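// Requires: using System; using System.Linq; using System.Text.RegularExpressions;
// Must be declared inside a static class to work as an extension method.
public static byte[] ToByteArray(this string source)
{
    // Left-pad with '0' to a multiple of 8 characters, then parse each 8-bit group.
    return
        Regex.Matches(source.PadLeft((source.Length + 7) / 8 * 8, '0'), "[01]{8}")
        .Cast<Match>()
        .Select(m => Convert.ToByte(m.Value, 2))
        .ToArray();
}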

Upvotes: 2

Michal B.

Reputation: 5719

48 is the ASCII code for '0' and 49 is the ASCII code for '1'. There are many ways to convert this string, but knowing that should be enough for you to manage on your own. Good luck :)

Possible solution:

    public static class StringExtensions
    {
        public static byte[] ToByteArray(this string str)
        {
            char[] arr = str.ToCharArray();
            byte[] byteArr = new byte[arr.Length];

            for (int i=0; i<arr.Length; ++i)
            {
                switch (arr[i])
                {
                    case '0': byteArr[i] = 0; break;
                    case '1': byteArr[i] = 1; break;
                    default: throw new ArgumentException("'" + arr[i] + "' is not 0 or 1.", "str");
                }
            }

            return byteArr;
        }
    }
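
Since '0' is 48 and '1' is 49, the same per-character mapping can also be written by subtracting '0' from each character. The following is only a minimal sketch of that idea; the class name `BitStringSketch` and the method name `ToBitArray` are hypothetical and not part of any framework API.

```csharp
using System;
using System.Linq;

public static class BitStringSketch
{
    // Hypothetical helper: maps each '0'/'1' character to the byte value 0 or 1.
    public static byte[] ToBitArray(string bits)
    {
        if (bits == null) throw new ArgumentNullException("bits");
        if (bits.Any(c => c != '0' && c != '1'))
            throw new ArgumentException("Input may contain only '0' and '1'.", "bits");

        // '0' has code 48 and '1' has code 49, so subtracting '0' leaves 0 or 1.
        return bits.Select(c => (byte)(c - '0')).ToArray();
    }
}
```

Calling `BitStringSketch.ToBitArray("0100110011001")` should give a 13-element array holding one 0 or 1 per input character.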

Upvotes: 2

Jon Skeet

Reputation: 1500515

You're asking for the text of the characters '0' and '1' to be encoded using UTF-8. In UTF-8, a '0' is represented by byte 48, and '1' is represented by byte 49. (Non-ASCII characters are represented by multiple bytes.)

It sounds like you really want a binary parser - you can use Convert.ToByte(text, 2) for a single byte, but I'm not sure there's anything in the framework to convert an arbitrary-length string to a byte array by parsing it as binary. I'm sure there are lots of third-party routines available on the net to do it though - it's not hard.
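
As a rough illustration of such a routine, here is a minimal sketch built on Convert.ToByte(text, 2). The names `BinaryStringParser` and `ParseBinary` are hypothetical, and the choice to left-pad with zeros and read the most significant bit first is an assumption, not something the question specifies.

```csharp
using System;

public static class BinaryStringParser
{
    // Hypothetical helper: parses a string of '0'/'1' characters as binary,
    // most significant bit first, into a byte array.
    public static byte[] ParseBinary(string text)
    {
        if (text == null) throw new ArgumentNullException("text");

        // Assumed convention: left-pad with zeros to a multiple of 8 digits.
        int paddedLength = (text.Length + 7) / 8 * 8;
        string padded = text.PadLeft(paddedLength, '0');

        byte[] result = new byte[paddedLength / 8];
        for (int i = 0; i < result.Length; i++)
        {
            // Convert.ToByte(string, 2) parses one 8-digit chunk as a base-2 number
            // and throws if the chunk is not a valid binary number.
            result[i] = Convert.ToByte(padded.Substring(i * 8, 8), 2);
        }
        return result;
    }
}
```

With this convention, "0100110011001" is padded to "0000100110011001" and parsed as the two bytes 9 and 153.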

It's very important that you understand why your original code didn't work though - what Encoding.GetBytes is really for.
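
To make that concrete, here is a small sketch of what Encoding.GetBytes does: it turns text into the bytes that represent the characters, and Encoding.GetString turns those bytes back into the same text. The class and method names (`EncodingDemo`, `Describe`) are made up for this illustration.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical helper: encodes the text as UTF-8, decodes it again,
    // and returns the byte values followed by the decoded text.
    public static string Describe(string text)
    {
        byte[] encoded = Encoding.UTF8.GetBytes(text);      // characters -> bytes
        string decoded = Encoding.UTF8.GetString(encoded);  // bytes -> characters
        return string.Join(",", encoded) + " -> " + decoded;
    }
}
```

`EncodingDemo.Describe("01")` returns "48,49 -> 01", which is exactly the 48/49 output from the question, while a non-ASCII character such as "é" comes back as the two bytes 195 and 169.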

Upvotes: 8
