Reputation: 7541
I'm creating a binary file to transmit to a third party that contains images and information about each image. The file uses a record length format, so each record is a particular length. The beginning of each record is the Record Length Indicator, which is 4 characters long and represents the length of the record in Big Endian format.
I'm using a BinaryWriter to write to the file, and for the Record Length Indicator I'm using Encoding.Default.
The problem I'm having is that there is one character in one record that is displaying as a "?" because it is unrecognized. My algorithm to build the string for the record length indicator is this:
private string toBigEndian(int value)
{
    string returnValue = "";
    string binary = Convert.ToString(value, 2).PadLeft(32, '0');
    List<int> binaryBlocks = new List<int>();
    binaryBlocks.Add(Convert.ToInt32(binary.Substring(0, 8), 2));
    binaryBlocks.Add(Convert.ToInt32(binary.Substring(8, 8), 2));
    binaryBlocks.Add(Convert.ToInt32(binary.Substring(16, 8), 2));
    binaryBlocks.Add(Convert.ToInt32(binary.Substring(24, 8), 2));
    foreach (int block in binaryBlocks)
    {
        returnValue += (char)block;
    }
    Console.WriteLine(value);
    return returnValue;
}
It takes the length of the record, converts it to 32-bit binary, converts that to chunks of 8-bit binary, and then converts each chunk to its appropriate character. The string that is returned here does contain the correct characters, but when it's written to the file, one character is unrecognized. This is how I'm writing it:
//fileWriter is BinaryWriter and record is Encoding.Default
fileWriter.Write(record.GetBytes(toBigEndian(length)));
Perhaps I'm using the wrong type of encoding? I've tried UTF-8, which should work, but it gives me extra characters sometimes.
Thanks in advance for your help.
Upvotes: 2
Views: 1889
Reputation: 700720
The problem is that you should not return the value as a string at all.
When you cast each value to a char and then encode it with an 8-bit encoding, several values are encoded to the wrong byte, and several cannot be encoded at all (which produces the ? characters). The only way not to lose data in that step would be to encode the string as UTF-16, but that would give you eight bytes instead of four.
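To make the loss concrete, here is a minimal sketch that casts one byte value to a char and encodes it three ways. The class name EncodingLossDemo, the helper Encode, and the sample value 0x90 are chosen only for illustration. Encoding.ASCII and Encoding.UTF8 stand in for Encoding.Default, whose behavior depends on the platform, so the exact bytes in the original program may differ.

```csharp
using System;
using System.Text;

public static class EncodingLossDemo
{
    // Hypothetical helper: casts a byte value to a char and encodes that single character.
    public static byte[] Encode(Encoding encoding, byte value)
    {
        return encoding.GetBytes(((char)value).ToString());
    }

    public static void Main()
    {
        // 0x90 is an arbitrary value above 0x7F, picked for illustration.
        const byte sample = 0x90;

        // ASCII cannot represent U+0090, so it is replaced by '?' (0x3F).
        Console.WriteLine(BitConverter.ToString(Encode(Encoding.ASCII, sample)));            // 3F

        // UTF-8 needs two bytes for U+0090, hence the "extra characters".
        Console.WriteLine(BitConverter.ToString(Encode(Encoding.UTF8, sample)));             // C2-90

        // UTF-16 keeps the value but spends two bytes per character.
        Console.WriteLine(BitConverter.ToString(Encode(Encoding.BigEndianUnicode, sample))); // 00-90
    }
}
```

None of the three encodings produces the single byte 0x90, which is why the length indicator has to stay binary from start to finish.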
You should return it as a byte array, so that you can write it to the file without converting it back and forth between character data and binary data.
private byte[] toBigEndian(int value) {
    byte[] result = BitConverter.GetBytes(value);
    if (BitConverter.IsLittleEndian) Array.Reverse(result);
    return result;
}
fileWriter.Write(toBigEndian(length));
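As a quick check of this approach, the sketch below writes a length indicator through a BinaryWriter into a MemoryStream and prints the bytes that were written. The class name RecordLengthDemo, the helper WriteLength, and the sample length 400 are assumptions made for the example; a real program would write to the file stream instead.

```csharp
using System;
using System.IO;

public static class RecordLengthDemo
{
    // Same idea as the method above: native-order bytes, reversed on little-endian machines.
    public static byte[] ToBigEndian(int value)
    {
        byte[] result = BitConverter.GetBytes(value);
        if (BitConverter.IsLittleEndian) Array.Reverse(result);
        return result;
    }

    // Hypothetical helper: writes the indicator to an in-memory stream and returns what was written.
    public static byte[] WriteLength(int length)
    {
        using (var stream = new MemoryStream())
        using (var fileWriter = new BinaryWriter(stream))
        {
            // Write(byte[]) emits the raw bytes with no length prefix and no text encoding.
            fileWriter.Write(ToBigEndian(length));
            fileWriter.Flush();
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        // 400 is 0x00000190, so the most significant byte comes first.
        Console.WriteLine(BitConverter.ToString(WriteLength(400))); // 00-00-01-90
    }
}
```

Exactly four bytes reach the stream, and no Encoding object is involved at any point.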
Upvotes: 6
Reputation: 181
Do not create a string from an int just to write bytes. Try this instead:
byte[] result =
{
    (byte)(value >> 24),
    (byte)(value >> 16),
    (byte)(value >> 8),
    (byte)(value >> 0)
};
Upvotes: 1
Reputation: 294407
To read or write integers in binary streams with the appropriate endianness, use the BitConverter class; its IsLittleEndian field tells you the byte order of the current machine: http://msdn.microsoft.com/en-us/library/system.bitconverter.islittleendian.aspx
Converting to a binary string and then tokenizing it into bytes is, I must say, the most unorthodox way I have seen yet :)
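For the reading side, a minimal sketch along the same lines might look like this. The class name RecordLengthReader and the method FromBigEndian are hypothetical, and the code assumes the buffer holds at least four bytes with the indicator at the start.

```csharp
using System;

public static class RecordLengthReader
{
    // Interprets the first four bytes of buffer as a big-endian 32-bit integer.
    public static int FromBigEndian(byte[] buffer)
    {
        // Work on a copy so the caller's buffer is not reordered.
        byte[] copy = new byte[4];
        Array.Copy(buffer, copy, 4);

        // BitConverter.ToInt32 expects the machine's native order, so reverse on little-endian hosts.
        if (BitConverter.IsLittleEndian) Array.Reverse(copy);
        return BitConverter.ToInt32(copy, 0);
    }

    public static void Main()
    {
        Console.WriteLine(FromBigEndian(new byte[] { 0x00, 0x00, 0x01, 0x90 })); // 400
    }
}
```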
Upvotes: 0
Reputation: 11608
If you really want a binary four bytes (i.e. not just four characters, but a big-endian 32-bit length value) then you want something like this:
byte[] bytes = new byte[4];
// Big-endian: the most significant byte goes first.
bytes[0] = (byte)((value >> 24) & 0xff);
bytes[1] = (byte)((value >> 16) & 0xff);
bytes[2] = (byte)((value >> 8) & 0xff);
bytes[3] = (byte)(value & 0xff);
fileWriter.Write(bytes);
Upvotes: 1