Gus

Reputation: 10659

C# string concatenation - in-line vs. line-by-line

C# code:

string first = "A"; 
first += "B"; 
first += "C";
string second = "D" + "E" + "F";


Generated IL code:

.locals init ([0] string first,
           [1] string second)
  IL_0000:  nop
  IL_0001:  ldstr      "A"
  IL_0006:  stloc.0
  IL_0007:  ldloc.0
  IL_0008:  ldstr      "B"
  IL_000d:  call       string [mscorlib]System.String::Concat(string,
                                                              string)
  IL_0012:  stloc.0
  IL_0013:  ldloc.0
  IL_0014:  ldstr      "C"
  IL_0019:  call       string [mscorlib]System.String::Concat(string,
                                                              string)
  IL_001e:  stloc.0
  IL_001f:  ldstr      "DEF"
  IL_0024:  stloc.1
  IL_0025:  ret

It seems clear that the in-line concatenation is a bit more efficient, since it compiles to a single ldstr with no String.Concat calls. Are there any other differences (for instance, in the string objects created in memory)?

Thanks

Upvotes: 1

Views: 1091

Answers (4)

Jon Hanna

Reputation: 113272

Yes, precisely as you say - string objects created in memory.

Try turning the IL back into C# by hand. The code is pretty much equivalent to:

string first = "A";
string __temp = "B";
first = string.Concat(first, __temp);
__temp = "C";
first = string.Concat(first, __temp);

string second = "DEF";

Further advantages can come down the line. As it is, the following strings will be in the intern pool: "A", "B", "C" and "DEF". Which is more advantageous depends on a few things, but it's likely best to have the strings that are actually used in the pool, and here "A", "B" and "C" aren't used except to create "ABC". In real code, though, those substrings would likely be more significant.
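A minimal sketch of how the pooling difference can be observed, assuming a default build where the runtime resolves identical literals in one assembly to the same object. The class and method names are made up for illustration:

```csharp
using System;

static class InternPoolSketch
{
    // Folded by the compiler into the single literal "DEF".
    public static string InLine()
    {
        return "D" + "E" + "F";
    }

    // Built at run time by two String.Concat calls.
    public static string LineByLine()
    {
        string first = "A";
        first += "B";
        first += "C";
        return first;
    }

    static void Main()
    {
        // Both sides load the same literal, so this normally prints True.
        Console.WriteLine(object.ReferenceEquals(InLine(), "DEF"));

        // Same contents as the literal "ABC"...
        Console.WriteLine(LineByLine() == "ABC");

        // ...but a separately allocated object, so a reference
        // comparison against the literal is expected to fail.
        Console.WriteLine(object.ReferenceEquals(LineByLine(), "ABC"));
    }
}
```

The reference comparisons are only there to make the object identity visible; ordinary code should compare strings with == or Equals.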

Most importantly though, note that this isn't a fair comparison between in-line vs. line by line. Try the following:

using (TextReader tr = new StreamReader(someFile)) // no way for the compiler to know what this will contain
{
  string a = tr.ReadLine();
  string b = tr.ReadLine();
  string c = tr.ReadLine();
  string d = tr.ReadLine();
  string e = tr.ReadLine();
  string f = tr.ReadLine();
  string first = a;
  first += b;
  first += c;
  string second = d + e + f;
}

Because the strings aren't hard-coded literals, the compiler can't fold them at compile time, so the available optimisations are different and the comparison between the two forms comes out differently. Other situations will show yet other differences.
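To make the run-time case concrete, here is a small sketch using method parameters in place of the file reads, so that the compiler again cannot know the values. The names are hypothetical, and the IL notes in the comments describe what the Microsoft C# compilers typically emit:

```csharp
using System;

static class RuntimeConcatSketch
{
    public static string LineByLine(string a, string b, string c)
    {
        string first = a;
        first += b; // String.Concat(string, string): allocates the intermediate a + b
        first += c; // a second String.Concat(string, string) call
        return first;
    }

    public static string InLine(string d, string e, string f)
    {
        // One String.Concat(string, string, string) call, no intermediate string.
        return d + e + f;
    }

    static void Main()
    {
        Console.WriteLine(LineByLine("x", "y", "z")); // prints "xyz"
        Console.WriteLine(InLine("x", "y", "z"));     // prints "xyz"
    }
}
```

Both methods return the same text; the in-line form just gets there with one call and one allocation instead of two.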

Upvotes: 2

Alexei Levenkov

Reputation: 100547

For constants, this is a duplicate of "C# Compile-Time Concatenation For String Constants": the compiler merges string constants.

Otherwise use StringBuilder and be happy.
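A minimal sketch of the StringBuilder route for strings that are only known at run time. The Join helper is a made-up name for illustration:

```csharp
using System;
using System.Text;

static class BuilderSketch
{
    // Appends every part into one buffer and creates the final string once.
    public static string Join(string[] parts)
    {
        var sb = new StringBuilder();
        foreach (string part in parts)
        {
            sb.Append(part);
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Join(new[] { "A", "B", "C" })); // prints "ABC"
    }
}
```

This pays off mainly when the number of pieces is large or unknown, such as in a loop; for two or three operands, plain + is usually fine.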

Upvotes: 0

Eric J.

Reputation: 150108

In the line-by-line case, the compiler does not have an optimization to notice that first has a value that can be known at compile time at every point. In the single-line case, it does have that optimization.

Only members of the C# compiler team can tell you with certainty why they did not optimize the first case, though I suspect it's because analyzing multiple lines of code is relatively complex compared to the benefit it would bring C# programmers.
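If the value really is meant to be fixed at compile time, one way to get the folding across several declarations is to use const, since each declaration is then a constant expression. A small sketch with made-up names:

```csharp
using System;

static class ConstFoldingSketch
{
    // Each initializer is a constant expression, so the compiler
    // evaluates it and stores the finished value.
    const string A = "A";
    const string AB = A + "B";
    public const string ABC = AB + "C";

    static void Main()
    {
        Console.WriteLine(ABC); // prints "ABC"
    }
}
```

This only works because a const can never be reassigned; an ordinary local such as first in the question gets no such treatment.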

Upvotes: 2

Joe

Reputation: 82614

string second = "D" + "E" + "F"; 

got optimized to:

string second = "DEF";

Upvotes: 3
