Shawn

Reputation: 5260

There is no difference between using char[] and StringBuilder and string when concatenating strings

It is said that char[] performs better than StringBuilder, and StringBuilder performs better than string, in terms of concatenation.

In my test there is no significant difference between using StringBuilder and string inside the loop. In fact the char[] is the slowest.

I am testing each method against the same table, which has 44 columns and 130,000 rows. The query is select * from test.

Can someone help me see if I did something wrong?

The following is the code

//fetchByString(rd, fldCnt, delimiter, sw);            // duration: 3 seconds

//fetchByBuilder(rd, fldCnt, delimiter, sw, rsize);    // duration: 3 seconds

//fetchByCharArray(rd, fldCnt, delimiter, sw, rsize);  // duration: 7 seconds

private void fetchByString(OracleDataReader pReader, int pFldCnt, string pDelimiter, StreamWriter pWriter)
{
  while (pReader.Read())
  {
    string[] s = new string[pFldCnt];
    for (Int32 j = 0; j < pFldCnt; j++)
    {
      if (pReader.IsDBNull(j))
      {
        s[j] = "";
      }
      else
      {
        s[j] = pReader.GetValue(j).ToString();          // correct value
      }
    }
    pWriter.WriteLine(string.Join(pDelimiter, s));      
  }
}
private void fetchByBuilder(OracleDataReader pReader, int pFldCnt, string pDelimiter, StreamWriter pWriter, int pRowSize)
{
  StringBuilder sb = new StringBuilder(pRowSize);
  while (pReader.Read())
  {
    for (Int32 j = 0; j < pFldCnt; j++)
    {
      if (pReader.IsDBNull(j))
      {
        //sb.Append("");
        sb.Append(pDelimiter);
      }
      else
      {
        sb.Append(pReader.GetValue(j).ToString());          // correct value
        sb.Append(pDelimiter);
      }
    }
    pWriter.WriteLine(sb.ToString());
    sb.Clear();
  }
}
private void fetchByCharArray(OracleDataReader pReader, int pFldCnt, string pDelimiter, StreamWriter pWriter, int pRowSize)
{
  char[] rowArray;
  int sofar;
  while (pReader.Read())
  {
    rowArray = new char[pRowSize];
    sofar = 0;
    for (Int32 j = 0; j < pFldCnt; j++)
    {
      if (pReader.IsDBNull(j))
      {
        pDelimiter.CopyTo(0, rowArray, sofar, pDelimiter.Length);
        sofar += pDelimiter.Length;
      }
      else
      {
        pReader.GetValue(j).ToString().CopyTo(0, rowArray, sofar, pReader.GetValue(j).ToString().Length);
        sofar += pReader.GetValue(j).ToString().Length;
        pDelimiter.CopyTo(0, rowArray, sofar, pDelimiter.Length);
        sofar += pDelimiter.Length;
      }
    }
    string a = new string(rowArray).TrimEnd('\0');
    pWriter.WriteLine(a);
  }
}

Upvotes: 3

Views: 2728

Answers (2)

dthorpe

Reputation: 36082

StringBuilder is preferred over string concat because string concat frequently has to allocate a temporary intermediate copy of the data with each + operator, which chews up a lot of memory fast and requires copying the data multiple times. StringBuilder.Append() copies each piece into an internal buffer that grows as needed, so it avoids allocating and recopying those intermediate strings. The final output string is allocated in one call at StringBuilder.ToString(), when its size is known.
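To make the difference concrete, here is a minimal sketch of the two approaches on an in-memory array of fields. The class and method names (ConcatDemo, JoinWithPlus, JoinWithBuilder) are made up for illustration and are not from the question's code.

```csharp
using System;
using System.Text;

public static class ConcatDemo
{
    // Repeated + in a loop: every += allocates a new string and copies
    // everything accumulated so far into it.
    public static string JoinWithPlus(string[] fields, string delimiter)
    {
        string row = "";
        for (int i = 0; i < fields.Length; i++)
        {
            if (i > 0) row += delimiter;
            row += fields[i];
        }
        return row;
    }

    // StringBuilder: each Append copies into a growable internal buffer;
    // the result string is allocated once in ToString().
    public static string JoinWithBuilder(string[] fields, string delimiter)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < fields.Length; i++)
        {
            if (i > 0) sb.Append(delimiter);
            sb.Append(fields[i]);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string[] fields = { "1", "", "abc" };
        Console.WriteLine(JoinWithPlus(fields, "|"));    // 1||abc
        Console.WriteLine(JoinWithBuilder(fields, "|")); // 1||abc
        Console.WriteLine(string.Join("|", fields));     // 1||abc
    }
}
```

All three produce the same text; only the number of intermediate allocations differs, and that only starts to matter when many pieces are appended to the same growing string.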

Your test cases aren't using string concat. You allocate a bunch of string fragments into an array of string, and then you call String.Join. That's essentially what StringBuilder does internally. Even after you remove the overhead of data I/O that may be dominating the benchmark times, I would expect String.Join() and StringBuilder.ToString() to produce similar performance.

Upvotes: 5

tmesser

Reputation: 7666

I'm not familiar with this claim, but there seem to be far more conversions going on in the char[] version the way you've written it.

pReader.GetValue(j).ToString(), besides putting the value in a format that's not the one you're working in (string instead of char[]), is called 3 times per column in the char[] version as opposed to just once in the others. For the comparison to be valid, you should probably find some way to get your 'true value' directly as a char[]. Otherwise, from a benchmarking perspective, you could be pulling down performance by introducing slowness from something other than the thing you're measuring. I'm not asserting that's what's happening, but procedurally it's considered important. Even if you can't do that, I think you might still see a small performance boost if you put in var stringRep = pReader.GetValue(j).ToString() and used stringRep instead of the repeated GetValue/ToString calls.
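A rough sketch of the stringRep suggestion, assuming an object[] stands in for one row of the reader, with DBNull.Value marking a NULL column. CharArrayRowDemo and BuildRow are hypothetical names. It also swaps in one change that is not part of the suggestion above: since sofar already holds the written length, the result is built with new string(rowArray, 0, sofar) rather than trimming '\0' characters afterwards.

```csharp
using System;

public static class CharArrayRowDemo
{
    // values is a stand-in for one reader row (an assumption for this sketch).
    // rowSize must be large enough to hold the whole row, as in the question.
    public static string BuildRow(object[] values, string delimiter, int rowSize)
    {
        char[] rowArray = new char[rowSize];
        int sofar = 0;
        foreach (object value in values)
        {
            if (!(value is DBNull))
            {
                // Convert once and reuse the result instead of calling
                // ToString() three times per column.
                string stringRep = value.ToString();
                stringRep.CopyTo(0, rowArray, sofar, stringRep.Length);
                sofar += stringRep.Length;
            }
            // Like the original, a delimiter follows every column.
            delimiter.CopyTo(0, rowArray, sofar, delimiter.Length);
            sofar += delimiter.Length;
        }
        // Only the first sofar characters were written.
        return new string(rowArray, 0, sofar);
    }

    public static void Main()
    {
        object[] row = { 42, DBNull.Value, "abc" };
        Console.WriteLine(BuildRow(row, "|", 64)); // 42||abc|
    }
}
```

Note that this keeps the trailing delimiter that the StringBuilder and char[] versions in the question write, which string.Join does not produce.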

Incidentally, I'm not sure how you're timing this, but if you're not using the Stopwatch class you might look into it, just to be sure your timing is accurate as well. It's basically made with this sort of benchmarking in mind. It would also let you isolate what you're actually trying to benchmark (the concatenation operations) without getting all the overhead of the Oracle reader mixed in as well.
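Something along these lines would time only the concatenation, with no database involved. It is a sketch under assumptions: the rows are synthetic in-memory data sized like the table in the question, and ConcatBenchmark, MakeRows, TimeMs and the two TotalLength methods are invented names. The results will vary by machine and runtime, so no timings are claimed here.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class ConcatBenchmark
{
    // Synthetic rows; the field strings are shared between rows to keep
    // memory use modest.
    public static string[][] MakeRows(int rowCount, int fieldCount)
    {
        var fields = new string[fieldCount];
        for (int c = 0; c < fieldCount; c++) fields[c] = "value" + c;

        var rows = new string[rowCount][];
        for (int r = 0; r < rowCount; r++) rows[r] = (string[])fields.Clone();
        return rows;
    }

    // Returns the total length so the work has an observable result.
    public static long TotalLengthWithJoin(string[][] rows, string delimiter)
    {
        long total = 0;
        foreach (string[] row in rows)
            total += string.Join(delimiter, row).Length;
        return total;
    }

    public static long TotalLengthWithBuilder(string[][] rows, string delimiter)
    {
        long total = 0;
        var sb = new StringBuilder();
        foreach (string[] row in rows)
        {
            for (int c = 0; c < row.Length; c++)
            {
                if (c > 0) sb.Append(delimiter);
                sb.Append(row[c]);
            }
            total += sb.ToString().Length;
            sb.Clear();
        }
        return total;
    }

    // Times a single call with Stopwatch.
    public static long TimeMs(Func<long> work)
    {
        Stopwatch sw = Stopwatch.StartNew();
        work();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        string[][] rows = MakeRows(130000, 44);

        // Warm-up pass so JIT compilation is not included in the timings.
        TotalLengthWithJoin(rows, "|");
        TotalLengthWithBuilder(rows, "|");

        Console.WriteLine("Join:    " + TimeMs(() => TotalLengthWithJoin(rows, "|")) + " ms");
        Console.WriteLine("Builder: " + TimeMs(() => TotalLengthWithBuilder(rows, "|")) + " ms");
    }
}
```

If the two numbers come out close here while the full fetch still takes seconds, that would suggest the reader and file I/O dominate the original measurements.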

Upvotes: 2
