C++ optimization of 2 lines of critical code

Question

Using valgrind and perf/FlameGraphs, I have identified the part of my application that consumes almost 100% of the CPU time:

for(size_t i = 0; i < objects.size(); i++) {

  //this part consumes 11% CPU -----> 
  collions_count = database->get_collisions(collisions_block, objects[i].getKey());
  feature1 = objects[i].feature1;
  //<--------

  for(int j = 0; j < collions_count * 2; j += 2) {

    hash = 
      ((collisions_block[j] & config::MASK_1) << config::SHIFT) | 
      ((collisions_block[j+1] - feature1) & config::MASK_2);

    if (++offsets[hash] >= config::THRESHOLD_1) {

      //... this part consumes < 1% of CPU

    }
  }
}

The calculation of hash and the if statement that follows it account for nearly 90% of the whole application's CPU time.

collisions_block is initialized once and is of type int[100000]
config:: is a namespace with variables containing global configuration
offsets is initialized once and is of type uint8_t[1<<24]
I am running CentOS 7, Linux kernel 3.10.0-327.13.1.el7.x86_64
all CPU time is spent in usr; mpstat shows no iowait
I am compiling with g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4) and flags -std=gnu++11 -Ofast -Wall
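
For anyone who wants to reproduce the measurement in isolation, the inner loop can be pulled out roughly as below. This is only a minimal sketch under stated assumptions, not the real application: the config:: values are placeholders (the actual ones are not shown above) chosen so that the hash fits in 24 bits, matching the size of offsets; count_hits is a hypothetical name; and the body of the if is reduced to a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Placeholder configuration. These values are assumptions made for this
// sketch; the real config:: values are not given in the question.
namespace config {
constexpr int MASK_1 = 0xFFF;            // assumed: keep the low 12 bits
constexpr int SHIFT = 12;                // assumed: 12 + 12 = 24-bit hash
constexpr int MASK_2 = 0xFFF;            // assumed: keep the low 12 bits
constexpr std::uint8_t THRESHOLD_1 = 3;  // assumed threshold
}

// Hypothetical helper: runs the inner loop for a single object and returns
// how many increments left a counter at or above THRESHOLD_1.
// collisions_block holds collisions_count pairs stored back to back.
std::size_t count_hits(const std::vector<int>& collisions_block,
                       int collisions_count, int feature1,
                       std::vector<std::uint8_t>& offsets) {
    assert(static_cast<std::size_t>(collisions_count) * 2 <= collisions_block.size());
    std::size_t hits = 0;
    for (int j = 0; j < collisions_count * 2; j += 2) {
        // Same expression as in the question, computed as unsigned.
        const std::uint32_t hash =
            (static_cast<std::uint32_t>(collisions_block[j] & config::MASK_1)
                 << config::SHIFT) |
            static_cast<std::uint32_t>(
                (collisions_block[j + 1] - feature1) & config::MASK_2);
        assert(hash < offsets.size());
        // offsets is a uint8_t table, so a counter wraps to 0 after 255,
        // exactly as it would in the original loop.
        if (++offsets[hash] >= config::THRESHOLD_1) {
            ++hits;  // stands in for the "< 1% of CPU" branch body
        }
    }
    return hits;
}
```

Calling count_hits with a std::vector<std::uint8_t> of size 1 << 24 mirrors the offsets table described above. Because each hash selects an essentially arbitrary byte in a 16 MiB table, the increment is a scattered read-modify-write, which may be worth keeping in mind when reading the profile.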

Is there any way to speed up the inner loop?
