user430481

Reputation: 315

Why are my .NET strings so large in memory?

If I run

string myString = "*.txt";
Print("sizeof(char): " + sizeof(char) + " bytes");
Print("myString.Length * sizeof(char): " + (myString.Length * sizeof(char)) + " bytes");

It will print

sizeof(char): 2 bytes

myString.Length * sizeof(char): 10 bytes

But, if I run the code from the first answer to this question:

myString = "*.txt";
long size = 0;
using (Stream s = new MemoryStream())
{
    BinaryFormatter formatter = new BinaryFormatter();
    formatter.Serialize(s, myString);
    size = s.Length;
}
Print("myString Serialized Size: " + size + " bytes");

I get

myString Serialized Size: 29 bytes

Which of these is a more accurate representation of how much space my string is taking up in memory?

Upvotes: 0

Views: 256

Answers (1)

Marc Gravell

Reputation: 1062770

Asking about the size (in bytes) of a string is complex:

  • internally, it will be UTF-16, so: twice as many bytes as characters (assuming it wasn't created over-sized, which is possible)
    • but the string object itself has the string length and the object overhead to consider, then there's "padding" etc
  • if you're talking about size in vanilla binary encodings, then you need to know what Encoding you're discussing; ASCII, UTF-8, UTF-16, etc - plus you need to know whether or not you're including a BOM
  • the one thing you would not do is run it through BinaryFormatter; BinaryFormatter is a general purpose serializer that includes type metadata, field names, etc; in general, you should almost never use BinaryFormatter ... for anything :)
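To make the encoding bullet concrete, here is a minimal sketch that counts the bytes a string needs in a given encoding, with or without the BOM. The class name EncodedSizes and the method ByteCount are hypothetical names chosen for illustration; Encoding.GetByteCount and Encoding.GetPreamble are the standard System.Text calls.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class EncodedSizes
{
    // Returns the number of bytes needed to encode `text` in `encoding`.
    // When includeBom is true, the length of the encoding's preamble
    // (its byte order mark, if it defines one) is added on top.
    public static int ByteCount(string text, Encoding encoding, bool includeBom)
    {
        int bomLength = includeBom ? encoding.GetPreamble().Length : 0;
        return encoding.GetByteCount(text) + bomLength;
    }
}
```

For "*.txt", ByteCount gives 5 for Encoding.ASCII and Encoding.UTF8 without a BOM, 10 for Encoding.Unicode (UTF-16) without a BOM, and 12 for Encoding.Unicode with its 2-byte BOM. None of these is the in-memory size of the string object; they only describe the encoded payload.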

So: the reason you're getting an unexpected answer is that you're asking the wrong question. For the "in memory" discussion, you're really after the first bullet. It isn't easy to give an exact answer because the size of the object overhead depends on your target platform.
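As a rough illustration of the "in memory" view, the sketch below adds up the pieces mentioned in the first bullet. Everything about the layout is an assumption made for the example, not a documented guarantee: an object header and a type pointer of one pointer each, a 4-byte length field, the UTF-16 characters plus a 2-byte terminator, and the total rounded up to the pointer size. The real figures differ between runtime versions and platforms, so treat the result as an estimate. StringMemoryEstimate and EstimateBytes are hypothetical names.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class StringMemoryEstimate
{
    // ASSUMED layout (not guaranteed by any runtime):
    //   object header + type pointer : 2 * pointer size
    //   length field                 : 4 bytes
    //   character data               : 2 bytes per char, plus a 2-byte terminator
    //   padding                      : round the total up to a multiple of the pointer size
    public static long EstimateBytes(string text, bool is64Bit)
    {
        int pointerSize = is64Bit ? 8 : 4;
        long raw = 2L * pointerSize
                 + sizeof(int)
                 + (long)sizeof(char) * (text.Length + 1);
        // Round up to the next multiple of pointerSize.
        return (raw + pointerSize - 1) / pointerSize * pointerSize;
    }
}
```

Under these assumptions, EstimateBytes("*.txt", true) returns 32 and EstimateBytes("*.txt", false) returns 24, compared with the 10 bytes of raw character data. The 29 bytes reported by BinaryFormatter measures a serialized stream, not this object.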

Upvotes: 3
