radders

Reputation: 923

String constructor

We can say,

string myString = "Hello";

Which 'magically' constructs a new string object holding that value.

Why can't a similar 'construction-less' approach be used for objects created from classes we define in our code? What's the 'magic' that VS does for strings? And for enums?

I've never seen an explanation of how this works.

Upvotes: 15

Views: 1609

Answers (4)

supercat

Reputation: 81105

Although the actual mechanics differ slightly from what I'll describe here, it's important to realize a string is not created when the code string myString = "Hello"; is executed. Rather, the string is created when the code is loaded.

The code for each assembly contains a big blob of binary data, which is read into an array along with the code. If the code contains 23 different string literals, then the contents of all those literals will appear in the array along with 23 entries, each of which lists the starting index and length of one of the strings. The process is conceptually similar to:

char[] RawData;  // Gets loaded by the runtime
string[] StringLiterals;

void create_strings()
{
  // The first two chars hold the number of literals (low word, then high word)
  int numStrings = (int)RawData[0] + 65536*(int)RawData[1];
  StringLiterals = new string[numStrings];
  for (int i=0; i<numStrings; i++)
  {
    // Each table entry is four chars: start index (2 chars), then length (2 chars)
    int header = i*4+2;
    int startLoc = (int)RawData[header] + 65536*(int)RawData[header+1];
    int length  = (int)RawData[header+2] + 65536*(int)RawData[header+3];
    StringLiterals[i] = new String(RawData, startLoc, length);
  }
}

If "Hello" happens to be entry #7 in an assembly's table of literals, then the characters "Hello" would appear in RawData at the position given by that entry. The aforementioned statement would then be translated as string myString = StringLiterals[7]; which does not create a new object, but simply returns a reference to an object that was created when the code was loaded.
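One way to observe this from C# is to compare references. The sketch below is only an illustration (the class name LiteralDemo is made up for the example); it relies on the fact that identical string literals in an assembly refer to the same interned instance, while a string built at run time is a separate object with equal contents.

```csharp
using System;

class LiteralDemo
{
    static void Main()
    {
        string a = "Hello";
        string b = "Hello";

        // Both variables refer to the single object created for the literal.
        Console.WriteLine(object.ReferenceEquals(a, b));   // True

        // A string constructed at run time is a distinct object.
        string c = new string(new[] { 'H', 'e', 'l', 'l', 'o' });
        Console.WriteLine(object.ReferenceEquals(a, c));   // False
        Console.WriteLine(a == c);                         // True (same contents)

        // string.IsInterned looks up the interned instance with the same contents.
        Console.WriteLine(object.ReferenceEquals(a, string.IsInterned(c)));   // True
    }
}
```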

Upvotes: 1

jakobbotsch

Reputation: 6337

The C# compiler turns this into the corresponding CIL instruction: ldstr. There is no equivalent for your own complex type, so the compiler must emit a newobj CIL instruction, which calls the constructor of your type. The syntax you suggest would hide this constructor call from the user.
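To make the difference concrete, here is a small sketch with the IL that the compiler emits, approximately, shown in comments. The Point class is hypothetical, and the exact IL depends on the compiler version and optimisation settings; a disassembler such as ildasm shows the real output.

```csharp
using System;

class Point   // hypothetical user-defined type
{
    public readonly int X, Y;
    public Point(int x, int y) { X = x; Y = y; }
}

class IlDemo
{
    static void Main()
    {
        // IL, approximately:  ldstr "Hello"
        string s = "Hello";

        // IL, approximately:  ldc.i4.1
        //                     ldc.i4.2
        //                     newobj instance void Point::.ctor(int32, int32)
        Point p = new Point(1, 2);

        Console.WriteLine(s + " " + p.X + "," + p.Y);   // prints "Hello 1,2"
    }
}
```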

Upvotes: 5

Anton Kryvenko

Reputation: 175

In fact, you can do so for your custom classes. It's achieved by defining your own implicit conversions from other types. It's covered very well on MSDN: http://msdn.microsoft.com/en-us/library/aa288476%28v=vs.71%29.aspx

Here's an example modified for string:

class Email
{
    private string user;
    private string domain;

    public Email(string user, string domain)
    {
        this.user = user;
        this.domain = domain;
    }

    static public implicit operator Email(string value) // magic goes here ;)
    {
        var parts = value.Split('@');
        if (parts.Length != 2)
            return null;

        return new Email(parts[0], parts[1]);
    }

    static public implicit operator string(Email value)
    {
        return "{ User = " + value.user + ", Domain = " + value.domain + " }";
    }
}

class Test
{
    static public void Main()
    {
        Email test = "[email protected]";

        System.Console.WriteLine("Test: " + test);
    }
}

Upvotes: 7

Jon Skeet

Reputation: 1500055

Basically, it's part of the C# language specification: there's syntax for string literals, numeric literals, character literals and Boolean literals, but that's all.

The compiler uses these literals to generate IL, and for most of them, there's a suitable instruction for "constant of a particular type", so it's directly represented. One exception to this is decimal, which is not a primitive type in terms of the CLR, and so has to have extra support. (That's why you can't specify a decimal argument when applying an attribute, for example.)
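The extra support for decimal can be seen through reflection. The following is a sketch, with the made-up names Limits and DecimalDemo: the Microsoft C# compiler stores a const int as a real metadata constant, whereas a const decimal is emitted as a static readonly field annotated with DecimalConstantAttribute, so the two fields should report different values for IsLiteral.

```csharp
using System;
using System.Reflection;

class Limits   // hypothetical holder for two constants
{
    public const int MaxItems = 10;
    public const decimal MaxTotal = 99.95m;
}

class DecimalDemo
{
    static void Main()
    {
        FieldInfo intField = typeof(Limits).GetField("MaxItems");
        FieldInfo decField = typeof(Limits).GetField("MaxTotal");

        // IsLiteral reports whether the value is stored as a constant in metadata.
        Console.WriteLine("int     IsLiteral: " + intField.IsLiteral);
        Console.WriteLine("decimal IsLiteral: " + decField.IsLiteral);

        // IsInitOnly reports whether the field is readonly.
        Console.WriteLine("decimal IsInitOnly: " + decField.IsInitOnly);
    }
}
```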

The simplest way to see what happens is to use ildasm (or a similar tool) to look at the IL generated for any specific bit of source code.

In terms of creating your own classes - you could provide an implicit conversion from string (or something else) to your own type, but that wouldn't have quite the same effect. You could write source code of:

MyType x = "hello";

... but that wouldn't be a "constant" of type MyType... it would just be an initializer which happened to use your implicit conversion.
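A minimal sketch of that difference, assuming a hypothetical MyType with an implicit conversion from string: the initializer compiles to a call to the conversion operator, which runs every time, and the type still cannot be used in a const declaration.

```csharp
using System;

class MyType   // hypothetical type, for illustration only
{
    public string Text { get; }
    private MyType(string text) { Text = text; }

    // Runs at execution time whenever a string is converted to MyType.
    public static implicit operator MyType(string text)
    {
        return new MyType(text);
    }
}

class ConversionDemo
{
    const string Greeting = "hello";     // a true compile-time constant
    // const MyType Bad = "hello";       // does not compile: a const of a class
                                         // type other than string must be null

    static void Main()
    {
        MyType x = "hello";   // calls the implicit operator
        MyType y = "hello";   // calls it again

        // Each conversion allocates its own object, unlike string literals.
        Console.WriteLine(object.ReferenceEquals(x, y));   // False
        Console.WriteLine(x.Text == Greeting);             // True
    }
}
```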

Upvotes: 25
