Anthony Wood
Anthony Wood

Reputation: 395

Performance issues with nested loops and string concatenations

Can someone please explain why this code is taking so long to run (i.e. >24 hours): The number of rows is 5000, whilst the number of columns is 2000 (i.e. Approximately 10m loops).

Is there a better way to do this????

for (int i = 0; i < m.rows; i++)
{
    for (int j = 0; j < m.cols; j++)
    {
        textToWrite += m[i, j].ToString() + ",";
    }
    //remove the final comma.
    textToWrite = textToWrite.Substring(0,textToWrite.Length-2);
    textToWrite += Environment.NewLine;
}

Upvotes: 0

Views: 373

Answers (8)

Liath
Liath

Reputation: 10191

The biggest issue I see with this is the fact you're using textToWrite as a string.

As strings are immutable so each time the string is changed new memory must be reserved copied from the previous version.

A far more efficient approach is to use the StringBuilder class which is designed for exactly this type of scenario. For example:

StringBuilder sb = new StringBuilder();
for (int i = 0; i < m.rows; i++)
{
    for (int j = 0; j < m.cols; j++)
    {
        sb.Append(m[i, j].ToString());
        if(j < m.cols - 1) // don't add a comma on the last element
        {
          sb.Append(",");
        }
    }
    sb.AppendLine();
}

Upvotes: 2

greg84
greg84

Reputation: 7599

Yes, the += operator is not very efficient. Use StringBuilder instead.

In the .NET framework a string is immutable, which means it cannot be modified in place. This means the += operator has to create a new string every time, which means allocating memory, copying the value of the existing string and writing it to the new location. It's ok for one or two concatenations, but as soon as you put it in a loop you need to use an alternative.

http://support.microsoft.com/kb/306822

You'll see a massive performance improvement by using the following code:

var textToWriteBuilder = new StringBuilder();

for (int i = 0; i < m.rows; i++)
{
    for (int j = 0; j < m.cols; j++)
    {
        textToWriteBuilder.Append(m[i, j].ToString() + ",");
    }

    // I've modified the logic on the following line, I assume you want to 
    // concatenate the value instead of overwriting it as you do in your question.
    textToWriteBuilder.Append(textToWriteBuilder.Substring(0, textToWriteBuilder.Length - 2));
    textToWriteBuilder.Append(Environment.NewLine);
}

string textToWrite = textToWriteBuilder.ToString();

Upvotes: 6

Guffa
Guffa

Reputation: 700152

The reason that it takes so long to run is because you are using string concatenation to create a string. For each iteration it will copy the entire string to a new string, so in the end you will have copied strings that adds up to several million times the final string.

Use a StringBuilder to create the string:

StringBuilder textToWrite = new StringBuilder();
for (int i = 0; i < m.rows; i++)
{
    for (int j = 0; j < m.cols; j++)
    {
        if (j > 0) textToWrite.Append(',');
        textToWrite.Append(m[i, j]);
    }
    textToWrite.AppendLine();
}

Upvotes: 0

Patrick Hofman
Patrick Hofman

Reputation: 156918

Because you are creating tons of strings.

You should use StringBuilder for this.

StringBuilder sb = new StringBuildeR();

for (int i = 0; i < m.rows; i++)
{
    bool first = true;

    for (int j = 0; j < m.cols; j++)
    {
        sb.Append(m[i, j]);

        if (first)
        {
            first = false;
        }
        else
        {
            sb.Append(",");
        }
    }

    sb.AppendLine();
}

string output = sb.ToString();

Upvotes: 2

Theodoros Chatzigiannakis
Theodoros Chatzigiannakis

Reputation: 29213

Your code is taking so long because you're appending strings, creating thousands of new temporary strings as you go. The memory manager needs to find memory for these strings (which increase in memory requirements, as they get longer) and the operation copies the characters you have so far (the number of which increases with every iteration) to the newest string.

The alternative is to use a single StringBuilder, on which you call Append() to append more efficiently and, finally, ToString() when you're done to get the finalized string that you want to use.

Upvotes: 2

usr
usr

Reputation: 171178

Assume the matrix is of size MxM and has N elements. You are building the string in a way that takes O(N^2) (or O(M^4)) in the number of iterations. Each operation must copy what's already there. The issue is not some constant-factor overhead like temporary strings.

Use StringBuilder.

String concatenation is more efficient for small number of concatenated strings. For a dynamic number of strings, use StringBuilder.

Upvotes: 1

Niels Keurentjes
Niels Keurentjes

Reputation: 41958

Use a StringBuilder instead of several million concatenations.

If you concatenate 2 strings, this means the system allocates new memory to contain both of them, and then copies both in. A zillion large memory allocations and copy actions become slow very fast.

What StringBuilder does is reduce this immensely by allocating 'in advance', thus only having to grow the buffer a few times and just copying it in, eliminating the by far slowest factor of your loop.

Upvotes: 1

JeffRSon
JeffRSon

Reputation: 11166

Supposing that textToWrite is a String, you should use StringBuilder instead. String is immutable and it is very ineffective to add small parts.

Ideally you would initialize StringBuilder with a reasonable size (see doc).

Upvotes: 1

Related Questions