I find the behavior of the following TypeScript snippet inconsistent. Am I missing something?

Question

Below, assigning a string literal to a variable of basic (primitive) type string is fine: let s3: string = "s".

But shouldn't TypeScript disallow the assignment of a string literal to a variable of the non-basic String type: let s1: String = "s"? Especially given that, later, s1 instanceof String is false:

TS Playground link.

let s1: String = "s"; // no error here
let s2: String = new String("s"); 
let s3: string = "s";
console.log(s1 instanceof String) // false
console.log(s2 instanceof String) // true
console.log(s1 === s2); // false
console.log(s1 === s3); // true;
console.log(typeof(s1)) // "string"
console.log(typeof(s2)) // "object"
console.log(typeof(s3)) // "string"
//console.log(s3 instanceof string) // error

The generated JS code is this (with TS 4.0, -t ESNext t.ts):

let s1 = "s"; // no error here
let s2 = new String("s");
let s3 = "s";
console.log(s1 instanceof String); // false
console.log(s2 instanceof String); // true
console.log(s1 === s2); // false
console.log(s1 === s3); // true;
console.log(typeof (s1)); // "string"
console.log(typeof (s2)); // "object"
console.log(typeof (s3)); // "string"
//console.log(s3 instanceof string) // error

I understand how things work in this JavaScript code, but why does TS generate it like that by default: a primitive value rather than an implicit instance of String? I'd rather expect let s1 = new String("s"), or an error.

I mean, if the variable v is of non-basic type Type in TypeScript, I'd expect v instanceof Type to be true, but that's not the case for s1.

Is this behavior defined in the specs somewhere?

jcalz · Accepted Answer

First: one thing TypeScript cannot and will not do is take code like let s1: String = "s"; and emit it to JavaScript as let s1 = new String("s");. That's because TypeScript's type system is erased when TypeScript is compiled to JavaScript. TypeScript's emitter will strip off any type-system-specific features like type annotations. If what is left is valid JavaScript in the target version of JavaScript, that will be emitted as-is. There really is no other option but let s1 = "s"; for the emitted JavaScript.
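To make the erasure concrete, here is a minimal sketch (the variable names annotated and inferred are only for illustration). Both declarations should come out of the compiler as the same JavaScript, so nothing at runtime can tell which one carried the String annotation:

```typescript
// The annotation below exists only at compile time.
const annotated: String = "s";
// No annotation at all; TypeScript infers the type from the literal.
const inferred = "s";

// Once the annotation is erased, both lines have the form `const x = "s";`,
// so both values are primitive strings at runtime.
console.log(typeof annotated); // "string"
console.log(typeof inferred); // "string"
console.log(annotated === inferred); // true
```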

It is specifically a non-goal of the TypeScript language design to "add or rely on run-time type information in programs, or emit different code based on the results of the type system". So any suggestion or expectation of this sort should be abandoned.

Now: why do they allow you to assign a string value to a variable of String type? This is the subject of microsoft/TypeScript#3448 (although the opener of that issue was arguing that types like string and String should be mutually assignable, whereas you are suggesting that they should be mutually incompatible... but the same topics arise, so see that issue for more info).

Currently, primitive types like string are considered assignable to the interfaces defined for their wrapper object types like String, but not vice versa. That is, in TypeScript, string is a (proper) subtype of String. This is all working as intended, although at least one of the language designers described the situation as a landmine.
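A small sketch of that one-way assignability (the variable names are made up for the example). The @ts-expect-error comment marks the direction the compiler rejects; the code still runs because the annotations are erased:

```typescript
const primitive: string = "s";
const wrapper: String = new String("s");

// Allowed: string is assignable to String.
const widened: String = primitive;

// Rejected at compile time: String is not assignable to string.
// @ts-expect-error Type 'String' is not assignable to type 'string'.
const narrowed: string = wrapper;

// The annotations do not change what the values are at runtime.
console.log(typeof widened); // "string"
console.log(typeof narrowed); // "object"
```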

So why is the above behavior (where string is assignable to String) working as intended? To try to make some sense of it, let's talk about a few usually-desirable features of TypeScript that combine to give this somewhat unfortunate behavior.

The first is the relationship between named class constructor values and the interface type of instances they construct. Let's say we have a class constructor named Foo; this will exist at runtime (either as an explicit ES2015 class or as an ES5 function) and we can use it to construct instances (e.g., new Foo("someArg");) and test against instances (e.g., val instanceof Foo). Then, generally speaking, there will be an interface in TypeScript's static type system, also named Foo, corresponding to the type of the instances created by the constructor. So if we call const foo = new Foo("someArg"); then when we inspect foo in our TypeScript IDE it will probably show us const foo: Foo;.

This same-name relationship happens automatically when we write a class in TypeScript. For library declarations not using the class syntax, this is also the usual convention; there will be a named interface for the constructor called something like FooConstructor with a construct signature like { new (arg: string): Foo }, and then the constructor value will be declared like declare var Foo: FooConstructor; (see this part of the handbook for more info)... which amounts to the same thing: Foo is the name of a constructor whose instances are of type Foo.

This convention is followed for String (and Number and Boolean). There is a StringConstructor interface and String is declared to be a value of that type. And when we call new String() on a StringConstructor we get a value assignable to the String interface.
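As a rough sketch of that convention, using a hypothetical Widget type rather than anything from the standard library: the interface describes instances, the ...Constructor interface describes the newable value, and both the type and the value share one name. A real declaration file would only declare the value; here one is supplied so the example runs.

```typescript
// Instance side: the shape of the values the constructor produces.
interface Widget {
    label: string;
}

// Static side: a construct signature returning the instance type.
interface WidgetConstructor {
    new (label: string): Widget;
}

// A library .d.ts would typically write `declare var Widget: WidgetConstructor;`.
// This sketch assigns an actual class so that it can be executed.
const Widget: WidgetConstructor = class {
    label: string;
    constructor(label: string) {
        this.label = label;
    }
};

const w = new Widget("ok"); // the type of w is the Widget interface
console.log(w instanceof Widget); // true
console.log(w.label); // "ok"
```

String follows the same layout: the String interface plays the role of Widget, and StringConstructor plays the role of WidgetConstructor.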

The next thing is that TypeScript's type system is structural and not nominal. If type A and type B have the same shape (i.e., their properties and methods have the same names and the same types), then TypeScript considers them the same type. Even if interface A { } and interface B { } are declared in two different places and do not mention each other, they can still be the same type.
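For instance, in this minimal sketch (Vec2, GridPos, and formatVec are invented names), two interfaces that never mention each other are interchangeable because their shapes match:

```typescript
interface Vec2 {
    x: number;
    y: number;
}

// Declared independently, but with exactly the same shape as Vec2.
interface GridPos {
    x: number;
    y: number;
}

function formatVec(v: Vec2): string {
    return `(${v.x}, ${v.y})`;
}

const pos: GridPos = { x: 1, y: 2 };

// Accepted: only the shape of the value matters, not the name of its type.
console.log(formatVec(pos)); // "(1, 2)"
```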

This combined with the prior feature leads to some weirdness with the behavior of instanceof:

class Foo {
    x: string;
    constructor(x: string) {
        this.x = x;
    }
}

const foo: Foo = new Foo("x");
console.log(foo instanceof Foo); // true

const bar: Foo = { x: "x" }; // also accepted
console.log(bar instanceof Foo); // false

I'm allowed to say that bar is of type Foo, because the Foo interface only cares that it has a property named x of type string. There's no requirement that a value of type Foo be constructed by the Foo constructor. So val instanceof Foo's behavior can't be cleanly represented in TypeScript. Mismatches of this sort, where JavaScript cares about a value's provenance but TypeScript does not, are unfortunate but not easily avoidable because of TypeScript's reliance on structural compatibility.

Finally, primitives in JavaScript are wrapped with wrapper objects of the corresponding types whenever you try to look at properties or call methods on them. This gives primitives like string the appearance of being objects with an interface like String. And so, TypeScript allows you to treat a string like a String, by giving it apparent members from the String interface. This is why you can write "foo".toUpperCase() without a warning in TypeScript.
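A short sketch of that wrapping behavior (greeting is just an example variable). The method lookup goes through String.prototype, yet the value itself stays a primitive:

```typescript
const greeting = "foo";

// Member access on a primitive behaves as if it were wrapped in a String object.
console.log(greeting.toUpperCase()); // "FOO"
console.log(greeting.length); // 3

// The method that was found is the one on String.prototype.
console.log(greeting.toUpperCase === String.prototype.toUpperCase); // true

// The primitive is not itself an object; only an explicit wrapper is.
console.log(typeof greeting); // "string"
console.log(typeof new String(greeting)); // "object"
```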

Put all three of these together and you end up with the mess from your question. TypeScript's String interface just means "this thing has the same properties and methods as a String object" and not "this thing was produced by calling new on the String constructor". Nothing prevents you from writing const str: String = "x";. But of course typeof str === typeof new String() will be false, and the outcome of any other test that depends on the difference between a primitive and its wrapper object will likewise be invisible to TypeScript's type system. It's a consequence of a few different useful features of TypeScript interacting in unpleasant ways.

It is therefore recommended never to use the wrapper object types. If you write the type String in your code, it's probably not what you want, so don't do it. Such advice may be a poor substitute for compiler warnings, but right now that seems to be the best that can be done.
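In practice that might look like the following sketch. The helper names shout and unwrap are hypothetical; unwrap is only meant for a boundary where a wrapper object could arrive from code you don't control, and it relies on valueOf() returning the underlying primitive:

```typescript
// Preferred: annotate parameters and variables with the primitive type.
function shout(message: string): string {
    return message.toUpperCase() + "!";
}

// Hypothetical boundary helper: convert a wrapper object back to a primitive
// once, so the rest of the code only ever sees `string`.
function unwrap(value: String): string {
    return value.valueOf();
}

const boxed = new String("hi");
const plain = unwrap(boxed);

console.log(typeof plain); // "string"
console.log(shout(plain)); // "HI!"
```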

Playground link to code
