Dervin Thunk
Dervin Thunk

Reputation: 20140

Optimizing C code with lots of dereferences

Linux perf and GPerfTools pprof give me a lot of stalled frontend and backend cycles at this piece of code:

for(j = oL; j < oR; j++) {
   T[j].v = (T[j].x / div)*K+T[j].y /div;
   T[j].x = T[j].x % div;
   T[j].y = T[j].y % div;        
   counterK[T[j].k]++;
  }

gives me:

 9.973.660.617 stalled-cycles-frontend   #   42,16% frontend cycles idle   
 4.874.722.502 stalled-cycles-backend    #   20,60% backend  cycles idle   

I understand that stalled cycles mean that the instruction pipeline cannot really advance, waiting for (maybe) some data from memory. I can see that there are a lot of dereferencing poniters to struct members in that piece of code, and that may be a problem, but I'm afraid I lack sufficient knowledge to see any optimization there. Could anyone help?

Upvotes: 1

Views: 88

Answers (1)

Jens Gustedt
Jens Gustedt

Reputation: 78973

Dereferencing struct pointers by itself should not be a problem on modern architectures. They can do relative addressing and cope quite well with such accesses. Also aliasing as mentioned in one comment shouldn't be a problem. *T and *counterK have different types, so C will never suppose that they alias.

Often, for such loops as you are showing us the processor/memory bandwidth is the bottelneck and not the speed of the processor. So may be, you are just at the limit that your memory can serve.

You are accessing T sequentially, this is already the best you can do for it. The only optimization possible for T would be that you have a lot of fields in it that are not used by the loop that you are showing us. Then you might be wasting processor / memory bandwidth. Then compacting things in an array that has just the information that is needed might help.

For counterK things are more complicated, because the information that you give use has no hint about the access pattern or the size of the array. If counterK is large (much larger than your L1 cache) and your access is very irregular, the stalls could come from here.

Upvotes: 1

Related Questions