Reputation: 11753
I read: How many bytes will a string take up? and How to know the size of the string in bytes? and some others, but I can't figure out the exact number of bytes a string will occupy when written with a BinaryWriter over a MemoryMappedViewStream over a MemoryMappedFile.
Sometimes the number of bytes taken is the string length + 1, sometimes it is the string length + 2.
I tried both:
But neither of them works. I also tried the string length plus a fixed amount, but that does not work either.
If I compare BinaryWriter.BaseStream.Position before and after the write, I can't find a rule that predicts the exact number of bytes written for a string (position after - position before). Is there alignment or something else going on that I am missing?
How can I get the exact number of bytes that will be written each time?
Update
I now use Encoding.UTF8.GetByteCount(str) + 1, which gives me the right size most of the time, but not always.
Upvotes: 0
Views: 197
Reputation: 11753
I found my answer based on the source code of Microsoft BinaryWriter at https://referencesource.microsoft.com/#mscorlib/system/io/binarywriter.cs,08f2e8c389fd32df
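Note: The length of the string is written before the string itself. That length is the encoded byte count (not the character count), stored as a 7-bit encoded integer: each prefix byte carries 7 bits of the length, so the prefix grows by one byte each time the byte count reaches 128, 2^14, 2^21 and 2^28.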
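To make the prefix layout concrete, here is a minimal sketch of the 7-bit encoding as described in the note above. `EncodeLengthPrefix` and the class name are hypothetical helpers written for illustration; they are not part of BinaryWriter's public API.

```csharp
using System;
using System.Collections.Generic;

public static class SevenBitPrefixSketch
{
    // Hypothetical helper: returns the bytes of the 7-bit encoded
    // length prefix for a given encoded byte count.
    public static byte[] EncodeLengthPrefix(int byteCount)
    {
        var bytes = new List<byte>();
        uint v = (uint)byteCount;
        while (v >= 0x80)
        {
            // Low 7 bits of the value, with the high bit set as a
            // "more bytes follow" flag.
            bytes.Add((byte)(v | 0x80));
            v >>= 7;
        }
        bytes.Add((byte)v); // final byte: high bit clear
        return bytes.ToArray();
    }

    public static void Main()
    {
        foreach (int n in new[] { 0, 127, 128, 300, 16383, 16384 })
        {
            Console.WriteLine("{0,6} -> {1}", n,
                BitConverter.ToString(EncodeLengthPrefix(n)));
        }
    }
}
```

For example, a byte count of 300 yields the two prefix bytes AC-02, while 127 still fits in the single byte 7F. That is why the overhead looks like "+1" for short strings and "+2" for longer ones.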
Code:
    // Must be declared inside a static class (extension method).
    public static int GetBinaryWriterSizeRequired(this string str, Encoding encoding)
    {
        // BinaryWriter(Stream) defaults to UTF-8, so fall back to that rather than Encoding.Default.
        encoding = encoding ?? Encoding.UTF8;
        int byteCount = encoding.GetByteCount(str);
        int byteCountRequiredToWriteTheSize = 1;
        // EO: This code is based on the Microsoft Source Code of the BinaryWriter at:
        // https://referencesource.microsoft.com/#mscorlib/system/io/binarywriter.cs,2daa1d14ff1877bd
        uint v = (uint)byteCount; // support negative numbers
        while (v >= 0x80)
        {
            // Each extra 7 bits of length costs one more prefix byte.
            v >>= 7;
            byteCountRequiredToWriteTheSize++;
        }
        return byteCountRequiredToWriteTheSize + byteCount;
    }
The call:
    ...
    _writer = new BinaryWriter(_stream);
    _writerEncoding = _writer.GetPrivateFieldValue<Encoding>("_encoding");
    ...
    int sizeRequired = name.GetBinaryWriterSizeRequired(_writerEncoding);
Helper (I know we should not read private fields, but I did it anyway):
    // Requires: using System; using System.Reflection;
    public static T GetPrivateFieldValue<T>(this object obj, string propName)
    {
        if (obj == null)
            throw new ArgumentNullException("obj");
        Type t = obj.GetType();
        FieldInfo fi = null;
        // Walk up the inheritance chain until the field is found.
        while (fi == null && t != null)
        {
            fi = t.GetField(propName, BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance);
            t = t.BaseType;
        }
        if (fi == null)
            throw new ArgumentOutOfRangeException("propName", string.Format("Field {0} was not found in Type {1}", propName, obj.GetType().FullName));
        return (T)fi.GetValue(obj);
    }
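As a possible alternative to reading the private field, the encoding can be chosen up front and passed to the BinaryWriter(Stream, Encoding) constructor, so the same instance is available for the size calculation. This is only a sketch under assumptions: a MemoryStream stands in for the MemoryMappedViewStream, and `SizeWithPrefix`, `WrittenBytes` and the class name are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class ExplicitEncodingSketch
{
    // Hypothetical helper: encoded byte count plus the 7-bit length prefix.
    public static int SizeWithPrefix(string value, Encoding encoding)
    {
        int byteCount = encoding.GetByteCount(value);
        int prefixBytes = 1;
        for (uint v = (uint)byteCount; v >= 0x80; v >>= 7)
        {
            prefixBytes++;
        }
        return byteCount + prefixBytes;
    }

    // Writes one string with a writer built on the given encoding and
    // returns how far the stream position advanced.
    public static long WrittenBytes(string value, Encoding encoding)
    {
        using (var stream = new MemoryStream()) // stand-in for the mapped view stream
        using (var writer = new BinaryWriter(stream, encoding))
        {
            long before = stream.Position;
            writer.Write(value);
            writer.Flush();
            return stream.Position - before;
        }
    }

    public static void Main()
    {
        string name = "h\u00E9llo";
        foreach (Encoding encoding in new Encoding[] { new UTF8Encoding(false, true), Encoding.Unicode })
        {
            Console.WriteLine("{0}: predicted={1} written={2}",
                encoding.WebName,
                SizeWithPrefix(name, encoding),
                WrittenBytes(name, encoding));
        }
    }
}
```

Because the caller owns the Encoding instance, no reflection is needed, and the predicted size follows whichever encoding the writer was constructed with.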
Just for reference, these are all wrong:
    return Encoding.UTF8.GetByteCount(str) + 1; // only right while the byte count is below 128
    return System.Text.ASCIIEncoding.Default.GetByteCount(str) + 1; // resolves to Encoding.Default, which may not match the writer's encoding
    return System.Text.ASCIIEncoding.Unicode.GetByteCount(str) + sizeof(int); // the prefix is not a fixed 4-byte int, and UTF-16 is not the writer's default
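To see where the fixed-offset guesses break down, the following sketch compares two of them with what a default BinaryWriter actually writes. It assumes a MemoryStream in place of the memory-mapped stream; `ActualSize` and the class name are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class GuessComparisonSketch
{
    // Bytes actually written by a default (UTF-8) BinaryWriter for one string.
    public static long ActualSize(string value)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(value);
            writer.Flush();
            return stream.Position;
        }
    }

    public static void Main()
    {
        string[] samples =
        {
            "abc",
            new string('a', 127),
            new string('a', 128),
            new string('a', 20000)
        };
        foreach (string s in samples)
        {
            int utf8PlusOne = Encoding.UTF8.GetByteCount(s) + 1;
            int utf16PlusInt = Encoding.Unicode.GetByteCount(s) + sizeof(int);
            Console.WriteLine("chars={0,5} actual={1,5} utf8+1={2,5} utf16+4={3,5}",
                s.Length, ActualSize(s), utf8PlusOne, utf16PlusInt);
        }
    }
}
```

The UTF-8 "+1" guess matches up to 127 bytes and then falls one byte short at 128 and two bytes short at 20000, which is consistent with the "+1 sometimes, +2 sometimes" behaviour described in the question.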
Upvotes: 0