Reputation: 30185
From Richter and this discussion, I would expect any two "identical" strings to be the same reference. But just now in LINQPad I got mixed results on this topic. Here is the code:
void Main()
{
    string alpha = String.Format("Hello{0}", 5);
    string brava = String.Format("Hello{0}", 5);
    ReferenceEquals(alpha, brava).Dump();
    String.IsInterned(alpha).Dump();
    String.IsInterned(brava).Dump();
    alpha = "hello";
    brava = "hello";
    ReferenceEquals(alpha, brava).Dump();
}
And here are the results from the Dump() calls:
False
Hello5
Hello5
True
I would have expected both the first and last ReferenceEquals to be True. What happened?
Besides the example above, in what other cases would the ReferenceEquals fail? For example, multi-threading?
This issue is important if, for example, I'm using string parameters passed into a method as the object upon which a lock is taken. The references better be the same in that case!!!
Upvotes: 3
Views: 417
Reputation: 180808
This blog entry explains why.
In short, if your string isn't allocated through ldstr (i.e. it's not a string literal defined within your code), it doesn't end up in the (hash) table of interned strings, and therefore interning doesn't occur.
The solution is to call String.Intern(str). The Intern method uses the intern pool to search for a string equal to the value of str. If such a string exists, its reference in the intern pool is returned. If the string does not exist, a reference to str is added to the intern pool, then that reference is returned.
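As a minimal sketch of that behavior (the class name InternDemo and the helper Build are made up for illustration), the two strings from the question become the same reference once both are passed through String.Intern:

```csharp
using System;

public static class InternDemo
{
    // Hypothetical helper: builds "Hello5" at run time, so it is not a literal.
    public static string Build() => String.Format("Hello{0}", 5);

    public static void Main()
    {
        string alpha = Build();
        string brava = Build();

        // Two separate objects with equal contents.
        Console.WriteLine(Object.ReferenceEquals(alpha, brava)); // False
        Console.WriteLine(alpha == brava);                       // True

        // Intern returns the pool's reference for this value, whichever
        // instance happened to be added first.
        string internedAlpha = String.Intern(alpha);
        string internedBrava = String.Intern(brava);
        Console.WriteLine(Object.ReferenceEquals(internedAlpha, internedBrava)); // True
    }
}
```

Note that only the values returned by Intern are guaranteed to share a reference; the original alpha and brava variables still point at whatever they pointed at before.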
Don't lock on strings, especially if you're attempting to use two different reference variables to attempt to point to the same (possibly) interned string.
Also note that there are some disadvantages to interning strings. Because string literals are not expected to change during the program's lifetime, interned strings are not garbage collected until your program exits.
Upvotes: 2
Reputation: 941724
Make the second case more interesting with:
alpha = "hello";
brava = "hell" + "o";
ReferenceEquals(alpha, brava).Dump();
The implementation is pretty straightforward. Somebody has to make the effort to recognize that a particular string matches another instance of a string. That takes time, inevitably. Time is in short supply at runtime, because string processing needs to be fast. But the compiler can take its sweet time finding matches; it can build a hash table of string literals. So the basic rule is that only compile-time constant string expressions will be interned.
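A small sketch of that rule, assuming everything lives in one assembly (ConstantFoldingDemo and Concat are hypothetical names): a constant expression is folded by the compiler and shares the literal's reference, while the same concatenation done through a method call at run time produces a new object.

```csharp
using System;

public static class ConstantFoldingDemo
{
    // Hypothetical helper: the arguments are not constants inside this method,
    // so the concatenation happens at run time via String.Concat.
    public static string Concat(string left, string right) => left + right;

    public static void Main()
    {
        string literal = "hello";
        string folded = "hell" + "o";         // constant expression, folded to "hello"
        string runtime = Concat("hell", "o"); // built at run time

        Console.WriteLine(Object.ReferenceEquals(literal, folded));  // True
        Console.WriteLine(Object.ReferenceEquals(literal, runtime)); // False
        Console.WriteLine(literal == runtime);                       // True
    }
}
```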
Upvotes: 1
Reputation: 393174
String interning is not guaranteed to happen. This should never be relied on.
Your last comparison yields True. This is not because of 'casual' interning happening, but due to the fact that both strings are initialized from the identical string literal "hello". In that particular case, they will have been interned. This was explained in the linked answer by Svick.
There is also no real need to rely on it. Use String.Equals to compare strings.
For locking, you need a separate lock variable. The usual pattern involves
private /*readonly*/ object lockObject = new object();
inside the scope that contains the object (string, in this case) it is supposed to guard. This is the only way it can work robustly if the reference gets changed.
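A rough sketch of that pattern, with made-up names (GuardedName, Append, LockDemo): the string field is reassigned on every update, but the lock object never changes, so every caller contends on the same monitor.

```csharp
using System;
using System.Threading.Tasks;

public class GuardedName
{
    // Private lock object: nothing outside this class can lock on it.
    private readonly object _lockObject = new object();
    private string _name = "";

    public void Append(string suffix)
    {
        lock (_lockObject)
        {
            // _name receives a new reference here; _lockObject stays the same.
            _name = _name + suffix;
        }
    }

    public int Length
    {
        get { lock (_lockObject) { return _name.Length; } }
    }
}

public static class LockDemo
{
    // Appends one character from many threads and returns the final length.
    public static int Run()
    {
        var guarded = new GuardedName();
        Parallel.For(0, 1000, i => guarded.Append("x"));
        return guarded.Length;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // 1000
    }
}
```

Had the code locked on _name itself, each update would swap the object being locked on, and two threads could end up inside the critical section together.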
Upvotes: 3
Reputation: 12741
String interning does not occur on dynamically created strings. This includes those created by String.Format and StringBuilder (I believe String.Format uses a StringBuilder internally). MSDN's documentation for String.Intern indicates this:
In the following example, the string s1, which has a value of "MyTest", is already interned because it is a literal in the program. The System.Text.StringBuilder class generates a new string object that has the same value as s1. A reference to that string is assigned to s2. The Intern method searches for a string that has the same value as s2. Because such a string exists, the method returns the same reference that is assigned to s1. That reference is then assigned to s3. References s1 and s2 compare unequal because they refer to different objects; references s1 and s3 compare equal because they refer to the same string.
string s1 = "MyTest";
string s2 = new StringBuilder().Append("My").Append("Test").ToString();
string s3 = String.Intern(s2);
Console.WriteLine((Object)s2==(Object)s1); // Different references.
Console.WriteLine((Object)s3==(Object)s1); // The same reference.
The key thing to note is that, to the CLR, the string produced by string.Format("Hello{0}", 5) is not a literal string, so it is not interned when the assembly is loaded. The "hello" strings, on the other hand, are literals and are interned by the CLR. To intern the formatted strings you would have to do so explicitly with String.Intern.
Edit
In regard to your locking question, you could in theory use strings as your locking object, but I would regard this as bad practice. You have no idea where the strings passed to your application came from, so there is no guarantee they are the same references. The strings could have come from a database read, a StringBuilder, String.Format, or user input. In these cases your locking would not ensure that only one thread is in your critical section at a time, since string interning is not guaranteed to have happened.
Even if you could guarantee that you were always using interned strings you would still have potentially dangerous problems. Now anyone could lock on that same string reference anywhere in your application (including other AppDomains). This is bad news.
I would recommend having an explicitly declared lock object (of type object). You will save yourself a ton of time in debugging threading issues if they arise.
Upvotes: 3
Reputation: 19203
To answer your original question, string interning is only for constant strings and when you explicitly ask for it, at least per that answer.
However, if you are looking to guarantee that a particular string is interned, you can call string.Intern.
string internedVersion = string.Intern("Some string");
In contrast, string.IsInterned returns the interned string if one exists, and null otherwise. There is no guarantee that a particular string object you hold is the interned instance, unless you have called Intern or IsInterned and are using the return value of one of those methods.
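To make the difference concrete, here is a hedged sketch (IsInternedDemo and its helper names are invented; it assumes a freshly generated GUID string is not already in the pool):

```csharp
using System;

public static class IsInternedDemo
{
    // Hypothetical helper: a GUID string is built at run time and is assumed
    // not to be in the intern pool yet.
    public static string FreshValue() => Guid.NewGuid().ToString();

    // Hypothetical helper: a separate string object with the same characters.
    public static string CopyOf(string value) => new string(value.ToCharArray());

    public static void Main()
    {
        string runtimeValue = FreshValue();

        // Nothing equal is in the pool, so IsInterned reports null.
        var before = String.IsInterned(runtimeValue);
        Console.WriteLine(before == null); // True

        // Intern adds the value to the pool and returns the pooled reference.
        string canonical = String.Intern(runtimeValue);

        // A different object with equal contents is not the pooled one,
        // but IsInterned hands back the pooled reference for it.
        string copy = CopyOf(runtimeValue);
        var after = String.IsInterned(copy);
        Console.WriteLine(Object.ReferenceEquals(copy, canonical));  // False
        Console.WriteLine(Object.ReferenceEquals(after, canonical)); // True
    }
}
```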
This issue is important if, for example, I'm using string parameters passed into a method as the object upon which a lock is taken. The references better be the same in that case!!!
You should never lock on a string object, you have no idea (due to interning) what is locking on those strings.
If you need to lock on strings, I recommend the following instead:
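// Maps each string key to its own dedicated lock object.
Dictionary<string, object> locks = new Dictionary<string, object>();
locks.Add("TEST", new object());
lock (locks["TEST"])
{
    // critical section for "TEST"
}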
Note that thanks to the default equality provided by string, this logic will work with non-interned strings as well.
Alternatively, you can make your own classes to wrap the strings and handle equality, but that is probably overkill for locks.
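If the dictionary of locks is itself touched from several threads, a plain Dictionary is not safe to mutate concurrently. One possible variation, sketched here with invented names (KeyedLocks, GetLock), swaps in ConcurrentDictionary so that lock objects can be created on demand:

```csharp
using System;
using System.Collections.Concurrent;

public static class KeyedLocks
{
    // One lock object per distinct string value; keys are compared by value,
    // so interning is irrelevant.
    private static readonly ConcurrentDictionary<string, object> Locks =
        new ConcurrentDictionary<string, object>();

    // Hypothetical helper: returns the single lock object stored for this key.
    public static object GetLock(string key) =>
        Locks.GetOrAdd(key, _ => new object());

    public static void Main()
    {
        string first = String.Format("Hello{0}", 5);
        string second = String.Format("Hello{0}", 5);

        Console.WriteLine(Object.ReferenceEquals(first, second));                   // False
        Console.WriteLine(Object.ReferenceEquals(GetLock(first), GetLock(second))); // True

        lock (GetLock(first))
        {
            // critical section for everything keyed by "Hello5"
        }
    }
}
```

GetOrAdd may run the factory more than once under contention, but only one object is stored and returned for a given key, which is what matters for locking. The dictionary grows with the number of distinct keys, so this suits a bounded key set.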
Upvotes: 2