Reputation: 19
Kindly look at the following program:
static void Main()
{
    string s1 = "Hello";
    string s2 = "Hello";
    Console.WriteLine((object)s1 == (object)s2);
    Console.ReadLine();
}
The output of this snippet is "True". Now my question is: does string s1 = "Hello"; create a new string object? If yes, how does it create a new object without calling the constructor and without using the new operator?
If string s1 = "Hello" and string s2 = "Hello" create two objects, then how come the answer is True?
Upvotes: 1
Views: 189
Reputation: 149538
Does string s1 = "Hello"; create a new string object? If yes, how does it create a new object without calling the constructor and without using the new operator?
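Yes, a string object is created, but not by your code. The compiler bakes the literal into the assembly's metadata under the "User Strings" section, and at run time the ldstr instruction pulls it from there and hands back a reference to a single pooled string object (this pooling is called "string interning"). That is why no constructor call or new operator appears in the source. You can view the metadata using ILDASM: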
User Strings
-------------------------------------------------------
70000001 : ( 5) L"Hello"
The compiler also recognizes the literal as a StringLiteralToken when it parses the syntax tree. It is aware of the dedicated syntax for string literals, which is what lets you write one without a constructor call.
If string s1 = "Hello" and string s2 = "Hello" create two objects, then how come the answer is True?
As I said in the first part, the string literal only becomes an object at run time. The string is loaded once and cached, so the second load of the same literal returns the cached object. The reference equality check therefore compares the object against itself and yields true.
You can see this in the emitted IL (Compiled in Release mode):
IL_0000: ldstr "Hello"
IL_0005: ldstr "Hello"
IL_000A: stloc.0 // s2
IL_000B: ldloc.0 // s2
IL_000C: ceq
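To make the caching behaviour concrete, here is a minimal sketch. The class and method names (LiteralDemo, LiteralsShareReference, BuiltStringSharesReference) are made up for illustration and are not part of the original question. It contrasts two identical literals with a string built from characters at run time, and the outputs in the comments are what I would expect on a typical JIT-compiled runtime:

```csharp
using System;

public static class LiteralDemo
{
    // Two identical literals: both loads return the same cached object.
    public static bool LiteralsShareReference()
    {
        string s1 = "Hello";
        string s2 = "Hello";
        return object.ReferenceEquals(s1, s2);
    }

    // A string built at run time via the constructor is a separate object,
    // even though its contents match the literal.
    public static bool BuiltStringSharesReference()
    {
        string literal = "Hello";
        string built = new string(new[] { 'H', 'e', 'l', 'l', 'o' });
        return object.ReferenceEquals(literal, built);
    }

    public static void Main()
    {
        Console.WriteLine(LiteralsShareReference());     // True
        Console.WriteLine(BuiltStringSharesReference()); // False
    }
}
```

The second method is the case the asker had in mind: an explicit constructor call does allocate a new object, which is why its reference differs from the literal's.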
Upvotes: 2
Reputation: 109557
If you intend to compare object references, it's clearer to do it like so:
Console.WriteLine(object.ReferenceEquals(s1, s2));
rather than like this:
Console.WriteLine((object)s1 == (object)s2);
That said, let's rewrite your code a little:
using System;

public class Program
{
    public static void Main()
    {
        string s1 = "Hello";
        string s2 = string2();
        Console.WriteLine(object.ReferenceEquals(s1, s2)); // true

        string s3 = "Hel";
        s3 = s3 + "lo";
        Console.WriteLine(object.ReferenceEquals(s1, s3)); // false

        // This is the equivalent of the line above:
        Console.WriteLine((object)s1 == (object)s3); // also false

        Console.WriteLine(s1 == s3); // true (comparing string contents)

        s3 = string.Intern(s3);
        Console.WriteLine(object.ReferenceEquals(s1, s3)); // now true

        Console.ReadLine();
    }

    private static string string2()
    {
        return "Hello";
    }
}
OK, so the question is: "Why do the first two strings have the same reference?"
The answer is that the compiler stores each distinct string literal only once, and the runtime keeps a table of the literal strings it has loaded so far. If a literal it encounters is already in that table, it doesn't create a new string object; instead, the new reference points at the string that is already in the table. This is called string interning.
The next thing to note is that if you create a new string by concatenating two strings at runtime, then that new string does NOT have the same reference as an existing string. A brand new string is created.
However, if you use == to compare that string with another string that has a different reference but the same contents, true will be returned. That's because the string == operator compares the contents of the strings.
The following line in the above code demonstrates this:
Console.WriteLine(s1 == s3); // true (comparing string contents)
Finally, note that the runtime can also "intern" strings created at run time, that is, use a reference to an existing string rather than a new string. However, it does not do this automatically; only string literals are interned for you. You can call string.Intern() to explicitly intern a string, as the code above shows.
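One related detail that is easy to trip over: a concatenation whose operands are all compile-time constants is folded by the compiler into a single literal, so it behaves like a literal rather than like a run-time concatenation. The sketch below illustrates this alongside string.IsInterned, which looks a string up in the intern pool by content without adding it. The class and method names (InternDemo and its three methods) are hypothetical, chosen only for this example:

```csharp
using System;

public static class InternDemo
{
    // Both operands are constants, so the compiler emits the single literal "Hello".
    public static bool ConstantConcatSharesReference()
    {
        string literal = "Hello";
        string folded = "Hel" + "lo";
        return object.ReferenceEquals(literal, folded);
    }

    // One operand is a variable, so the concatenation runs at run time
    // and produces a new string object.
    public static bool RuntimeConcatSharesReference()
    {
        string literal = "Hello";
        string prefix = "Hel";
        string built = prefix + "lo";
        return object.ReferenceEquals(literal, built);
    }

    // IsInterned returns the pooled instance with the same contents, or null if there is none.
    public static bool PoolLookupFindsLiteral()
    {
        string literal = "Hello";
        string prefix = "Hel";
        string built = prefix + "lo";
        return object.ReferenceEquals(literal, string.IsInterned(built));
    }

    public static void Main()
    {
        Console.WriteLine(ConstantConcatSharesReference()); // True
        Console.WriteLine(RuntimeConcatSharesReference());  // False
        Console.WriteLine(PoolLookupFindsLiteral());        // True
    }
}
```

Unlike string.Intern(), string.IsInterned() never adds anything to the pool, which makes it handy for checking whether a given value is already interned.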
Upvotes: 4