Reputation: 61
I have the following structure:
struct Matrix
{
int numOfRows;
int numOfColumns;
double* values;
int* permutationVector;
};
Also I have the following function:
void SetRowToZero(Matrix* m, int row)
{
int rowBegin = row*(m->numOfColumns);
for (int c = 0; c < (m->numOfColumns); c++)
{
m->values[rowBegin + c] = 0;
}
}
I wonder whether evaluating c < (m->numOfColumns) on every iteration slows the loop down.
Is there any difference if I write the function like this:
void SetRowToZero(Matrix* m, int row)
{
// Unpacking structure
int numOfColumns = m->numOfColumns;
double* values = m->values;
int rowBegin = row*(m->numOfColumns);
for (int c = 0; c < numOfColumns; c++)
{
values[rowBegin + c] = 0;
}
}
And in general, should I even care about performance on such a small scale?
Upvotes: 1
Views: 124
Reputation: 726579
There should be no performance issues to speak of: an optimizing compiler should be able to turn your first code fragment into the second one, store the pointer in a register, and either optimize out the index c or use an addressing mode that computes offset+index in hardware.
Note: If you prefer pointer arithmetic, you could rewrite your loop without using indexes. Performance of this code would be similar to that of your original two blocks of code, but it translates almost directly to a simple block of assembly code with pointers stored in registers:
void SetRowToZero(Matrix* m, int row) {
double *rowPtr = m->values + (row*(m->numOfColumns));
double *pastEndPtr = rowPtr + m->numOfColumns;
while (rowPtr != pastEndPtr) {
*rowPtr++ = 0;
}
}
Also note that none of the three implementations is thread-safe: if the value of m->numOfColumns or m->values is changed in the middle of the loop by another thread, you would end up with behavior that is very likely to be undefined, and definitely unexpected.
Upvotes: 4
Reputation: 650
With optimizations enabled, both pieces of code end up using a register for that value. I tested this little program on x86_64, compiled with gcc 5.4 at -O4 (which GCC treats the same as -O3):
#include <stdlib.h>
struct Matrix
{
int numOfRows;
int numOfColumns;
double* values;
int* permutationVector;
};
void SetRowToZero1(struct Matrix* m, int row)
{
int rowBegin = row*(m->numOfColumns);
for (int c = 0; c < (m->numOfColumns); c++)
{
m->values[rowBegin + c] = 0;
}
}
void SetRowToZero2(struct Matrix* m, int row)
{
// Unpacking structure
int numOfColumns = m->numOfColumns;
double* values = m->values;
int rowBegin = row*(m->numOfColumns);
for (int c = 0; c < numOfColumns; c++)
{
values[rowBegin + c] = 0;
}
}
int main() {
struct Matrix matrix = {5,1000000, malloc(5 * 1000000 * sizeof(double)), NULL};
SetRowToZero1(&matrix, 1);
}
I compiled it:
gcc -O4 main.c -o test1.out
and then I changed
SetRowToZero1(&matrix, 1);
to
SetRowToZero2(&matrix, 1);
compiled:
gcc -O4 main.c -o test2.out
then compared the checksums:
$ md5sum test1.out test2.out
504fb75e97173a6864750f5feb7cea58  test1.out
504fb75e97173a6864750f5feb7cea58  test2.out
So, at least for this compiler, target, and optimization level, the two implementations produce identical binaries and make no difference :)
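Identical checksums compare the generated binaries. A complementary check is to confirm that both variants also produce the same result at run time. The following is only a minimal sketch: OnlyRowIsZero and VariantsAgree are hypothetical helpers that are not part of the original program, and the matrix sizes passed to them are arbitrary.

```c
#include <assert.h>
#include <stdlib.h>

struct Matrix
{
    int numOfRows;
    int numOfColumns;
    double* values;
    int* permutationVector;
};

/* Variant 1: reads the struct members inside the loop. */
void SetRowToZero1(struct Matrix* m, int row)
{
    int rowBegin = row * (m->numOfColumns);
    for (int c = 0; c < (m->numOfColumns); c++)
    {
        m->values[rowBegin + c] = 0;
    }
}

/* Variant 2: copies the struct members into locals first. */
void SetRowToZero2(struct Matrix* m, int row)
{
    int numOfColumns = m->numOfColumns;
    double* values = m->values;
    int rowBegin = row * numOfColumns;
    for (int c = 0; c < numOfColumns; c++)
    {
        values[rowBegin + c] = 0;
    }
}

/* Hypothetical helper: returns 1 if the given row is all zeros and
   every other element still holds the fill value, 0 otherwise. */
static int OnlyRowIsZero(const struct Matrix* m, int row, double fill)
{
    for (int r = 0; r < m->numOfRows; r++)
    {
        for (int c = 0; c < m->numOfColumns; c++)
        {
            double expected = (r == row) ? 0.0 : fill;
            if (m->values[r * m->numOfColumns + c] != expected)
            {
                return 0;
            }
        }
    }
    return 1;
}

/* Hypothetical check: runs each variant on its own matrix filled
   with 1.0 and returns 1 if both zeroed exactly the requested row. */
int VariantsAgree(int rows, int cols, int row)
{
    size_t bytes = (size_t)rows * (size_t)cols * sizeof(double);
    struct Matrix a = { rows, cols, malloc(bytes), NULL };
    struct Matrix b = { rows, cols, malloc(bytes), NULL };
    assert(a.values != NULL && b.values != NULL);

    for (int i = 0; i < rows * cols; i++)
    {
        a.values[i] = 1.0;
        b.values[i] = 1.0;
    }

    SetRowToZero1(&a, row);
    SetRowToZero2(&b, row);

    int ok = OnlyRowIsZero(&a, row, 1.0) && OnlyRowIsZero(&b, row, 1.0);
    free(a.values);
    free(b.values);
    return ok;
}
```

Calling VariantsAgree(3, 4, 1) from main should return 1 if both variants zero only row 1. This says nothing about speed; it only guards against the two versions drifting apart in behavior.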
Upvotes: 4
Reputation: 234715
Maxim: Readability wins over performance.
In other words, computers are cheap, programmers are not.
There would be no difference in this case, since the loop conditional is a simple member access.
It could be a different matter if the loop condition called a function, though. In that case you could run the loop backwards, so the function is evaluated only once:
for (int c = <expensive function> - 1; c >= 0; --c)
{
m->values[rowBegin + c] = 0;
}
taking extra care if c were an unsigned type, since c >= 0 is then always true.
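A minimal sketch of the backwards loop with an unsigned index. ExpensiveCount, ExpensiveCallCount, and ZeroBackwards are hypothetical names used only for illustration; ExpensiveCount stands in for whatever costly function supplies the bound, and it counts its own calls so the single evaluation can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an expensive bound computation.
   It records how many times it has been called. */
static int g_calls = 0;

static size_t ExpensiveCount(size_t n)
{
    g_calls++;
    return n;
}

/* Reports how many times the bound has been computed so far. */
int ExpensiveCallCount(void)
{
    return g_calls;
}

/* Zeroes values[0] .. values[n-1], walking backwards with an
   unsigned index. */
void ZeroBackwards(double* values, size_t n)
{
    /* The bound is evaluated once, in the loop initializer.
       `c-- > 0` compares first and decrements afterwards, so the
       body sees n-1 down to 0 and the loop stops at zero. A test
       of the form `c >= 0` would never be false for size_t,
       because an unsigned value wraps around instead of going
       negative. */
    for (size_t c = ExpensiveCount(n); c-- > 0; )
    {
        values[c] = 0;
    }
}
```

With a signed int index the original c >= 0 form works as written; the c-- > 0 form is just one common idiom that is correct for both signed and unsigned indexes.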
Upvotes: 3
Reputation: 223917
There's no difference between these two pieces of code. They do the same thing, with one using intermediate variables.
If it makes sense from a readability standpoint to use temporaries then use them, otherwise don't.
Upvotes: 2