Eugene

Reputation: 61

Pointer in for loop in c program

I have the following structure:

struct Matrix
{
    int     numOfRows;
    int     numOfColumns;
    double* values;
    int*    permutationVector;
};

Also I have the following function:

void SetRowToZero(Matrix* m, int row)
{
    int rowBegin = row*(m->numOfColumns);
    for (int c = 0; c < (m->numOfColumns); c++)
    {
        m->values[rowBegin + c] = 0;
    }
}

I wonder whether evaluating c < (m->numOfColumns) on every iteration slows the loop down. Is there any difference if I write the function like this:

void SetRowToZero(Matrix* m, int row)
{
    // Unpacking structure
    int numOfColumns = m->numOfColumns;
    double* values = m->values;

    int rowBegin = row*(m->numOfColumns);

    for (int c = 0; c < numOfColumns; c++)
    {
        values[rowBegin + c] = 0;
    }
}

And in general, should I even care about performance on such a small scale?

Upvotes: 1

Views: 124

Answers (4)

Sergey Kalinichenko

Reputation: 726579

There should be no performance issues to speak of: an optimizing compiler should be able to turn your first code fragment into the second, keep the pointer in a register, optimize out the index c, or use an addressing mode that computes offset+index in hardware.

Note: If you prefer pointer arithmetic, you could rewrite your loop without using indexes. Performance of this code would be similar to that of your original two blocks of code, but it translates almost directly to a simple block of assembly code with pointers stored in registers:

void SetRowToZero(Matrix* m, int row) {
    double *rowPtr = m->values + (row*(m->numOfColumns));
    double *pastEndPtr = rowPtr + m->numOfColumns;
    while (rowPtr != pastEndPtr) {
        *rowPtr++ = 0;
    }
}
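
A minimal sketch of how one might check that the indexed and pointer versions behave the same on a small matrix. The typedef is an assumption added so that plain Matrix* compiles as C, and the names SetRowToZeroIndexed, SetRowToZeroPointer and CheckOnlyRowZeroed are hypothetical, chosen only for illustration:

```c
#include <stdlib.h>

/* Assumption: a typedef so that plain `Matrix*` compiles as C. */
typedef struct Matrix
{
    int     numOfRows;
    int     numOfColumns;
    double* values;
    int*    permutationVector;
} Matrix;

/* Index-based version, as in the question. */
void SetRowToZeroIndexed(Matrix* m, int row)
{
    int rowBegin = row * m->numOfColumns;
    for (int c = 0; c < m->numOfColumns; c++)
    {
        m->values[rowBegin + c] = 0;
    }
}

/* Pointer-based version, as in the answer above. */
void SetRowToZeroPointer(Matrix* m, int row)
{
    double* rowPtr = m->values + row * m->numOfColumns;
    double* pastEndPtr = rowPtr + m->numOfColumns;
    while (rowPtr != pastEndPtr)
    {
        *rowPtr++ = 0;
    }
}

/* Hypothetical helper: fills a rows x cols matrix with 1.0, applies setRow,
   and returns 1 if exactly the requested row is zero and every other
   element is still 1.0; returns 0 otherwise (or if allocation fails). */
int CheckOnlyRowZeroed(void (*setRow)(Matrix*, int), int rows, int cols, int row)
{
    Matrix m = { rows, cols, malloc((size_t)rows * cols * sizeof(double)), NULL };
    if (m.values == NULL) return 0;
    for (int i = 0; i < rows * cols; i++) m.values[i] = 1.0;

    setRow(&m, row);

    int ok = 1;
    for (int r = 0; r < rows; r++)
    {
        for (int c = 0; c < cols; c++)
        {
            double expected = (r == row) ? 0.0 : 1.0;
            if (m.values[r * cols + c] != expected) ok = 0;
        }
    }
    free(m.values);
    return ok;
}
```

Calling CheckOnlyRowZeroed(SetRowToZeroIndexed, 4, 5, 2) and CheckOnlyRowZeroed(SetRowToZeroPointer, 4, 5, 2) should both return 1. This only checks behavior, not speed; for speed you would still need to measure or inspect the generated assembly.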

Also note that none of the three implementations is thread-safe: if another thread changes the value of m->numOfColumns or m->values in the middle of the loop, you would end up with behavior that is very likely to be undefined, and definitely unexpected.

Upvotes: 4

Efi Weiss

Reputation: 650

Compiler optimizations will result in both pieces of code being optimized to use a register for that value. I tested this little program on x86_64, compiled with gcc 5.4 at -O4:

#include <stdlib.h>

struct Matrix
{
    int     numOfRows;
    int     numOfColumns;
    double* values;
    int*    permutationVector;
};

void SetRowToZero1(struct Matrix* m, int row)
{
    int rowBegin = row*(m->numOfColumns);
    for (int c = 0; c < (m->numOfColumns); c++)
    {
        m->values[rowBegin + c] = 0;
    }
}


void SetRowToZero2(struct Matrix* m, int row)
{
    // Unpacking structure
    int numOfColumns = m->numOfColumns;
    double* values = m->values;

    int rowBegin = row*(m->numOfColumns);

    for (int c = 0; c < numOfColumns; c++)
    {
        values[rowBegin + c] = 0;
    }
}

int main() {
    struct Matrix matrix = {5,1000000, malloc(5 * 1000000 * sizeof(double)), NULL};
    SetRowToZero1(&matrix, 1);
}

I compiled it:

gcc -O4 main.c -o test1.out

and then I changed

SetRowToZero1(&matrix, 1);

to

SetRowToZero2(&matrix, 1);

compiled:

gcc -O4 main.c -o test2.out

then compared the checksums:

$ md5sum test1.out test2.out

504fb75e97173a6864750f5feb7cea58 test1.out

504fb75e97173a6864750f5feb7cea58 test2.out

The two binaries are byte-for-byte identical, so with this compiler and these flags the two implementations make no difference :)

Upvotes: 4

Bathsheba

Reputation: 234715

Maxim: Readability wins over performance.

In other words, computers are cheap, programmers are not.


There would be no difference in this case, since the loop condition is a simple member access.

It could be a different matter if the loop condition were a function call, though. But in that case you could run the loop backwards, so that the function is evaluated only once:

for (int c = <expensive function> - 1; c >= 0; --c)
{
    m->values[rowBegin + c] = 0;
}

taking extra care if c were an unsigned type, since c >= 0 is then always true.
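
To make the point about a function in the loop condition concrete, here is a rough sketch. ExpensiveBound and boundCalls are hypothetical stand-ins (the counter only exists to show how often the bound is evaluated), and the three loop functions are illustrative names, not anything from the question:

```c
#include <stddef.h>

/* Hypothetical stand-in for an expensive bound; boundCalls counts
   how many times it has been evaluated. */
int boundCalls = 0;

size_t ExpensiveBound(void)
{
    boundCalls++;
    return 8;
}

/* Bound in the loop condition: evaluated before every iteration,
   plus once more for the final failing test. */
void ZeroForwardNaive(double* values)
{
    for (size_t c = 0; c < ExpensiveBound(); c++)
    {
        values[c] = 0;
    }
}

/* Bound hoisted into a local: evaluated once, loop still runs forwards. */
void ZeroForwardHoisted(double* values)
{
    size_t n = ExpensiveBound();
    for (size_t c = 0; c < n; c++)
    {
        values[c] = 0;
    }
}

/* Backwards loop with an unsigned index. `c >= 0` is always true for
   size_t, so the test and the decrement are combined in `c-- > 0`. */
void ZeroBackwardUnsigned(double* values)
{
    for (size_t c = ExpensiveBound(); c-- > 0; )
    {
        values[c] = 0;
    }
}
```

With the bound of 8 used here, ZeroForwardNaive calls ExpensiveBound 9 times, while the hoisted and backwards versions each call it once. If iteration order does not matter, hoisting into a local is arguably the more readable of the two single-call options.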

Upvotes: 3

dbush

Reputation: 223917

There's no difference between these two pieces of code. They do the same thing, with one using intermediate variables.

If it makes sense from a readability standpoint to use temporaries then use them, otherwise don't.

Upvotes: 2
