Reputation: 4286
Say that I have a (large) collection of instances of Dictionary. The key value in each dictionary is always one of, say, 10 known strings.
If the collection contains 1000000 entries, will that string key value occupy memory for each instance and key? Is there any good way to optimize a case like that, perhaps using string interning?
Another way would be to use, say, a short for the key instead and translate between the string and the short, but the syntax gets a bit messy...
Upvotes: 3
Views: 1561
Reputation: 81660
Usually not: the dictionaries hold references to a single immutable string object rather than separate copies. Strings can be interned, which helps save memory.
But this depends. If you construct the string every time (e.g. by concatenation), the results will not be interned. Defining the keys as constants ensures they will be interned.
You can check whether two strings are the same object in memory using object.ReferenceEquals().
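As a minimal sketch of that check (the class name ReferenceCheckDemo and the helper BuildAtRuntime are made up for illustration), the following compares a constant against a string with the same characters that is built at run time:

```csharp
using System;

static class ReferenceCheckDemo
{
    // Hypothetical helper: builds "hello" from a char array at run time,
    // so the result is a fresh string object rather than the literal.
    public static string BuildAtRuntime() => new string(new[] { 'h', 'e', 'l', 'l', 'o' });

    public static void Main()
    {
        const string Key = "hello";       // compile-time constant
        string sameConstant = Key;        // refers to the same literal
        string built = BuildAtRuntime();  // new instance, same characters

        Console.WriteLine(object.ReferenceEquals(Key, sameConstant)); // True
        Console.WriteLine(object.ReferenceEquals(Key, built));        // False
        Console.WriteLine(Key == built);                              // True (value equality)
    }
}
```

The == operator compares contents, so only ReferenceEquals tells you whether two variables share one object.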
Upvotes: 2
Reputation: 133950
As others have said, it depends on how you're getting the strings to put into your list. A couple of examples should help.
Imagine you have a text file that contains 1,000 lines, all the same. That is, a file has "hello" repeated 1,000 times:
hello
hello
hello
...
If you write a program to read that file into a List<string> the naive way, then there will be 1,000 different string instances. That is:
var myList = new List<string>();
using (var reader = new StreamReader("filename"))
{
    string s;
    while ((s = reader.ReadLine()) != null)
    {
        myList.Add(s); // each string is a unique instance
    }
}
If memory is a concern, then what you want to do is maintain a lookup table that has string keys and string values. It maps each distinct string to one canonical instance, so when a duplicate string comes in, you get back a reference to the first instance.
var KeyLookup = new Dictionary<string, string>();

string AddString(string key)
{
    string value;
    if (!KeyLookup.TryGetValue(key, out value))
    {
        // first time this string is seen: it becomes the shared instance
        value = key;
        KeyLookup.Add(key, value);
    }
    return value;
}
And then when you read the file:
while ((s = reader.ReadLine()) != null)
{
    myList.Add(AddString(s)); // duplicate strings use the same instance
}
In this case, there will be only one instance of the string "hello" in the program.
You can do something similar with the keys in your lists. Create a lookup table for your keys and make sure that whenever you add a key to your list, you add the value from the lookup table rather than the key itself.
As others have pointed out, if your known keys are already constants and you always use the constant values when adding them to the lists, then the strings are already interned and the above isn't required.
Upvotes: 0
Reputation: 111810
Constant strings are interned (so string str = "hello"; is interned). Other strings normally aren't. You can force a string to be interned using the String.Intern static method, but be sure to read about the side effects at http://msdn.microsoft.com/en-us/library/system.string.intern.aspx . Remember that if you have a const hello string and a dynamically built hello string, only the first will be interned. Sometimes you can save a little memory by interning often-used strings.
In your situation, if you are using only a small number of strings that are already stored in another collection AND you copy these strings (var str2 = str1), then you aren't duplicating the string, only creating another reference. BUT if you obtain the new string by manipulating the old one (var str2 = ("Z" + str1).Substring(1)), then you really are creating a new string instead of referencing the old one.
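A small sketch of the difference, assuming nothing beyond the standard String.Intern method (the class InternDemo and its Build helper are hypothetical names used only here): two dynamically built strings are separate objects until they are passed through String.Intern, after which both calls return the same pooled reference.

```csharp
using System;

static class InternDemo
{
    // Hypothetical helper mirroring the expression above:
    // produces a new string object with the same characters as s.
    public static string Build(string s) => ("Z" + s).Substring(1);

    public static void Main()
    {
        string a = Build("hello");
        string b = Build("hello");
        Console.WriteLine(object.ReferenceEquals(a, b));   // False: two separate objects

        string ia = string.Intern(a);
        string ib = string.Intern(b);
        Console.WriteLine(object.ReferenceEquals(ia, ib)); // True: one pooled instance
    }
}
```

Storing ia and ib as dictionary keys instead of a and b would let the duplicates be garbage collected, at the cost of keeping the pooled string alive for the lifetime of the process.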
Upvotes: 1
Reputation: 941208
String is a reference type. The dictionary contains a reference to the actual string object, which takes 4 bytes in a 32-bit process (8 bytes in a 64-bit one). Adding the same string object to multiple dictionaries produces only one copy of the string.
You already got what you are looking for.
Upvotes: 1
Reputation: 18420
I think that to save space we can do one of the following.
Create 10 different List<int> instances, one for each item name (string), and then search all 10 lists when retrieving an item.
Or create a dictionary of lists, like Dictionary<string, List<int>>, and store each key in the list that belongs to its item name (string). It's almost the same as above, but it allows you to add more item names in the future.
I believe we would still get better performance this way (however, I would like others to comment on my assumption).
Also, if you have about 100,000 records, you would be better off storing them in a database with two tables: one containing the item name string and an ID, and the other containing the key and a foreign key to the item ID.
Upvotes: 0