Reputation: 21999
An example:
var a = $"Some value 1: {b1:0.00}\nSome value 2: {b2}\nSome value 3: {b3:0.00000}\nSome value 4: {b4:0.00}\nSome value 5: {b6:0.0}\nSome value 7: {b7:0.000000000}";
That source is somewhat hard to read.
I can split it like this:
var a = $"Some value 1: {b1:0.00}\n" +
$"Some value 2: {b2}\n" +
$"Some value 3: {b3:0.00000}\n" +
$"Some value 4: {b4:0.00}\n" +
$"Some value 5: {b6:0.0}\n" +
$"Some value 7: {b7:0.000000000}";
But here is a comment saying what this will be multiple calls to string.Format
and I think it will (no idea how to check it, IL is a black box for me yet).
Question: is it ok to do? What are other options to split long interpolated string?
Upvotes: 0
Views: 448
Reputation: 31116
What does the compiler do?
Let's start here:
var a = $"Some value 1: {b1:0.00}\n" +
$"Some value 2: {b2}\n" +
$"Some value 3: {b3:0.00000}\n" +
$"Some value 4: {b4:0.00}\n" +
$"Some value 5: {b6:0.0}\n" +
$"Some value 7: {b7:0.000000000}";
IL is a black box for me yet
Why not simply open it up? That's pretty easy using a tool like ILSpy, Reflector, etc.
What will happen in your code is that each line is compiled to a string.Format. The rule is pretty simple: if you have $"...{X}...{Y}..." it will be compiled as string.Format("...{0}...{1}...", X, Y). Also the + operator will introduce a string concatenation.
In more detail, string.Format is a simple static call, which means that the compiler will use the call opcode instead of callvirt.
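To make the rule concrete, here is a minimal sketch of the two forms side by side. The class name InterpolationDemo and the sample parameters are made up for illustration, and the second method is only a hand-written approximation of what the answer describes; newer compilers may lower interpolated strings differently, but the resulting string should be the same.

```csharp
using System;

// Hypothetical demo class, not part of the question's code.
static class InterpolationDemo
{
    // Interpolated form, split across lines with + as in the question.
    public static string Interpolated(double b1, int b2) =>
        $"Some value 1: {b1:0.00}\n" +
        $"Some value 2: {b2}\n";

    // Hand-written equivalent: one string.Format per line, joined by +.
    public static string Lowered(double b1, int b2) =>
        string.Format("Some value 1: {0:0.00}\n", b1) +
        string.Format("Some value 2: {0}\n", b2);
}
```

Both methods format with the current culture, so comparing their results for the same inputs is a quick sanity check without reading any IL.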
From all this you might deduce that it's pretty easy for a compiler to optimize this: if we have an expression like constant string + constant string + ... you can simply replace it with constant string. You can argue that the compiler has knowledge about the inner workings of string.Format and string concatenation and can handle that. On the other hand, you could argue that it should not. Let me detail the two considerations:
Note that strings are objects in .NET, but they are 'special ones'. You can see this from the fact that there's a special ldstr opcode, but also if you check out what happens if you switch on a string -- the compiler will generate a dictionary. So, from this you could deduce that the compiler 'knows' how a string works internally. Let's figure out if it knows how to do concatenation, ok?
var str = "foo" + "bar";
Console.WriteLine(str);
In IL (Release mode of course) this will give:
L_0000: ldstr "foobar"
tl;dr: So, regardless of whether concatenation of interpolated strings is already optimized this way (it is not), I'd be pretty confident that the compiler will handle this case eventually.
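If you'd rather not read IL at all, there is a way to see the constant folding from plain C#. A const initializer has to be a compile-time constant, so the sketch below only compiles because the compiler folds the concatenation itself. The class name ConstantFoldingDemo is made up for illustration.

```csharp
using System;

// Hypothetical demo class, only here to show the folding.
static class ConstantFoldingDemo
{
    // Compiles only because "foo" + "bar" is evaluated by the compiler,
    // not at runtime.
    public const string Folded = "foo" + "bar";

    // The same holds for longer chains of literals split across lines.
    public const string Multiline =
        "Some value 1\n" +
        "Some value 2\n";
}
```

Note that this says nothing about interpolated strings with non-constant holes; it only confirms the literal + literal case from the answer.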
What does the JIT do?
Next question would be: how smart is the JIT compiler with strings?
So, let's consider for a moment that we will teach the compiler about all the inner workings of string. First we should note that C# is compiled to IL, which is JIT compiled to assembler. In the case of the switch it's pretty hard for the JIT compiler to create the dictionary, so we have to do it in the compiler. On the other hand, if we're handling more complex concatenation, it makes sense to reuse the things we already have available for, say, integer arithmetic to do string operations as well. This implies putting string operations in the JIT compiler. Let's consider that for a moment with an example:
var str = "";
for (int i=0; i<10; ++i) {
str += "foo";
}
Console.WriteLine(str);
The compiler will simply compile the concatenation to IL, which means that the IL will hold a pretty straightforward implementation of this. In this case loop unrolling arguably has a lot of benefits for the (runtime) performance of the program: it can simply unroll the loop, appending the string 10 times, which results in a simple constant.
However, giving this knowledge to the JIT compiler makes it more complex, which means that the runtime will spend more time on JIT compiling (figuring out the optimization) and less time executing (running the emitted assembler). The question that remains is: what will happen?
Start the program, put a breakpoint on the Console.WriteLine call, press Ctrl+Alt+D and look at the disassembly.
00007FFCC8044413 jmp 00007FFCC804443F
{
str += "foo";
00007FFCC8044415 mov rdx,2BEE2093610h
00007FFCC804441F mov rdx,qword ptr [rdx]
00007FFCC8044422 mov rcx,qword ptr [rbp-18h]
00007FFCC8044426 call 00007FFD26434CC0
[...]
00007FFCC804443A inc eax
00007FFCC804443C mov dword ptr [rbp-0Ch],eax
00007FFCC804443F mov ecx,dword ptr [rbp-0Ch]
00007FFCC8044442 cmp ecx,0Ah
00007FFCC8044445 jl 00007FFCC8044415
tl;dr: Nope, that's not optimized.
But I want the JIT to optimize that as well!
Yea, well, I'm not too sure if I share that opinion. There's a balance between runtime performance and time spent in JIT compilation. Notice that if you're doing something like this in a tight loop, I would argue that you're asking for trouble. On the other hand, if it's a common and trivial case (like the constants that are concatenated) it's pretty easy to optimize and it doesn't affect the runtime.
In other words: arguably, you don't want this to be optimized by the JIT, assuming that would take too much time. I'm confident we can trust Microsoft in making this decision wisely.
Also, you should realize that strings in .NET are heavily optimized things. We all know that they're used a lot, and so does Microsoft. If you're not writing 'really stupid code', it's a very reasonable assumption that it will perform just fine (until proven otherwise).
Alternatives?
What are other options to split long interpolated string?
Use resources. Resources are a useful tool in dealing with multiple languages. And if this is just a small, non-professional project - I simply wouldn't bother at all.
Alternatively you can use the fact that constant strings are concatenated:
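// Placeholders are zero-based positions in the argument list below:
// {0} = b1, {1} = b2, ..., {6} = b7 (b5 is passed but unused, as in the question).
var fmt = "Some value 1: {0:0.00}\n" +
"Some value 2: {1}\n" +
"Some value 3: {2:0.00000}\n" +
"Some value 4: {3:0.00}\n" +
"Some value 5: {5:0.0}\n" +
"Some value 7: {6:0.000000000}";
var a = string.Format(fmt, b1, b2, b3, b4, b5, b6, b7);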
Upvotes: 3
Reputation: 660038
what this will be multiple calls to string.Format and I think it will
You're right. You haven't said why you care. Why is that to be avoided?
is it ok to do?
It's fine by me.
What are other options to split long interpolated string?
I would use a verbatim interpolated string. That will solve your problem nicely.
See "How do you use verbatim strings with interpolation?"
(Since that is the link you mentioned in the question I am not 100% clear on why you asked this question, since you already read a page that suggested a good answer.)
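For reference, a minimal sketch of what the verbatim interpolated form could look like for the first three values. The class name VerbatimDemo and the parameter types are assumptions for illustration. Two things to keep in mind: continuation lines have to start at column 0, because any indentation becomes part of the string, and the line breaks are whatever the source file uses (CRLF or LF) rather than an explicit \n.

```csharp
using System;

// Hypothetical demo class; b1..b3 stand in for the question's variables.
static class VerbatimDemo
{
    public static string Build(double b1, int b2, double b3) => $@"Some value 1: {b1:0.00}
Some value 2: {b2}
Some value 3: {b3:0.00000}";
}
```

This compiles to a single format operation, so there is no concatenation left to worry about.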
I don't like $@ idea, it makes it worse than long string
You might have said that earlier.
can't it be accidentally damaged by reformatting sources?
All code can be changed by changing the sources.
What are other options to split long interpolated string?
Don't interpolate in the first place. Make the string a resource, make a class responsible for fetching formatted resource strings, and hide the implementation details of how you format the string inside methods of the class.
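A rough sketch of that shape, under stated assumptions: the class name ReportStrings and the method FormatValues are hypothetical, and the constant below is only a stand-in for a real resource lookup (in an actual project the format string would more likely come from a .resx file through a ResourceManager).

```csharp
using System;
using System.Globalization;

// Hypothetical wrapper: callers never see the format string or how it is applied.
static class ReportStrings
{
    // Stand-in for a resource; assumed to be loaded from resources in real code.
    private const string ValuesFormat =
        "Some value 1: {0:0.00}\n" +
        "Some value 2: {1}\n" +
        "Some value 3: {2:0.00000}\n";

    // The only thing callers use; the formatting mechanism can change freely.
    public static string FormatValues(double b1, int b2, double b3) =>
        string.Format(CultureInfo.CurrentCulture, ValuesFormat, b1, b2, b3);
}
```

Usage is then just var a = ReportStrings.FormatValues(b1, b2, b3); and the long string no longer sits in the calling code at all.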
Upvotes: 7