pgplus1628

Reputation: 1374

How to get CPU performance counter for a piece of code

As we all know, perf is the tool for reading a program's CPU performance counters, such as cache misses, cache references, and instructions executed.

Question: how can I get those performance counters for just one piece of code (such as a function) inside a C or C++ program?
For example, my program first does some initialization, then does the work, then finalizes. I only want the performance counters for the work, such as the function do_something_1.

int main(int argc, char ** argv) {
    do_initialize();
    for (int i = 0;i < 100 ;i ++) {
        /* begin profile code */
        do_something_1();
        /* end profile code */
        do_something_2();
    } 
    do_finalize();
}

Upvotes: 7

Views: 7316

Answers (5)

Pisco

Reputation: 51

I did some research to solve the same problem in my project and found another framework, SkyPat (https://skypat.skymizer.com), which, like PAPI, can read the PMU counters for a piece of code.

I have tried both PAPI and SkyPat to get the PMU counters for a function. The main difference, I think, is that SkyPat combines unit testing with perf_event: it borrows the concepts of Google Test and adds an interface to the PMU, so it is easy to integrate with Google Test.

For example, suppose you want to measure cache references and cache misses (plus a few software counters) for a function:

#include <unistd.h>
#include "pat/pat.h"
#include "test.h"

PAT_F(MyCase, my_test)
{
  int result = 0;

  COUNT(pat::CONTEXT_SWITCHES) {
    test(10);
  }
  COUNT(pat::CPU_CLOCK) {
    test(10);
  }
  COUNT(pat::TASK_CLOCK) {
    test(10);
  }
  COUNT(pat::CACHE_REFERENCES) {
    test(10);
  }
  COUNT(pat::CACHE_MISSES) {
    test(10);
  }
}

int main(int argc, char* argv[])
{
  pat::Test::Initialize(&argc, argv);
  pat::Test::RunAll();
}

The resulting SkyPat log:

[    pat   ] Running 1 tests from 1 cases.
[----------] 1 test from MyCase.
[ RUN      ] MyCase.my_test
[ TIME (ns)]         2537         1000          843         1855         1293
[EVENT TYPE] [CTX SWITCH] [CPU  CLOCK] [TASK CLOCK] [CACHE  REF] [CACHE MISS]
[RESULT NUM]            0          982          818            2            0
[==========] 1 test from 1 cases ran.
[  PASSED  ] 1 test.

Upvotes: 1

pgplus1628

Reputation: 1374

I finally found a library that reads those counters for a piece of code:

PAPI

For example, suppose you want to measure L3 data cache reads for some piece of code.

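#include "papi.h"
#include <cstddef>
#include <iostream>
#include <glog/logging.h>

// Number of ints to touch: about 10 GiB in total, so shrink it on smaller machines.
#define ASIZE 2684354560

#define event_count (1) // the number of events you want to trace

int main(int argc, char ** argv) {

  int events[event_count] = {PAPI_L3_DCR}; // L3 Data Cache Read
  int ret;
  long long int values[event_count]; // results

  int* array = new int[ASIZE];

  /* start counters (classic high-level API; it has been removed from recent
     PAPI releases, 6.0 onward as far as I can tell, so this needs an older PAPI) */
  ret = PAPI_start_counters(events, event_count);
  CHECK_EQ(ret, PAPI_OK);

  size_t tot_cnt = 1;
  for (size_t cnt = 0; cnt < tot_cnt; cnt++) {
    for (size_t i = 0; i < ASIZE; i++) {
      array[i] = static_cast<int>(i); // the stored value is irrelevant; only the memory traffic matters
    }
  }

  /* read counters (this call also resets them) */
  ret = PAPI_read_counters(values, event_count);
  CHECK_EQ(ret, PAPI_OK);

  for (size_t i = 0; i < event_count; i++) {
    LOG(INFO) << " " << values[i];
  }

  delete[] array;
  return 0;
}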

Makefile:

CXX?=g++
INC?=-I<path to where papi is installed>/include/
LIB?=-L<path to where papi is installed>/lib/ -lpapi -lglog

main : main.cpp
	${CXX} -O3 ${INC} -o $@ $< ${LIB}

all : main

.PHONY: all clean
clean :
	rm -f main

Upvotes: 6

L.Y.

Reputation: 21

I am facing the same situation as you, and I did some research on it. Here is what I learned. First, perf is part of the kernel, and you can find its headers in

/usr/src/kernels/$VERSION/include/linux/perf_regs.h
/usr/src/kernels/$VERSION/include/linux/perf_event.h
/usr/src/kernels/$VERSION/include/uapi/linux/perf_event.h

I think the core file is perf_event.h. You could also check its GitHub site, which has some explanation of how to use it, but the explanation is not very clear and I am still confused about many points.
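To make the perf_event.h interface a little more concrete, here is a minimal sketch of counting one event around a region of code. It assumes Linux and follows the pattern of the perf_event_open(2) man page, which calls syscall() directly. `RegionCounter` is a hypothetical wrapper name of my own, not part of any library.

```cpp
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cstdint>
#include <cstring>

// Hypothetical wrapper (not from any library) around one perf event.
// It counts for the calling thread only between start() and stop().
class RegionCounter {
 public:
  RegionCounter(std::uint32_t type, std::uint64_t config) {
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.type = type;
    attr.size = sizeof(attr);
    attr.config = config;
    attr.disabled = 1;        // stay idle until start()
    attr.exclude_kernel = 1;  // count user-space activity only
    attr.exclude_hv = 1;
    // pid = 0, cpu = -1: the calling thread, on whichever CPU it runs.
    fd_ = static_cast<int>(
        syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0UL));
  }
  ~RegionCounter() {
    if (fd_ >= 0) close(fd_);
  }
  RegionCounter(const RegionCounter&) = delete;
  RegionCounter& operator=(const RegionCounter&) = delete;

  // False when the kernel refused to open the event.
  bool ok() const { return fd_ >= 0; }

  void start() {
    if (ok()) ioctl(fd_, PERF_EVENT_IOC_ENABLE, 0);
  }
  void stop() {
    if (ok()) ioctl(fd_, PERF_EVENT_IOC_DISABLE, 0);
  }

  // Count accumulated over all start()/stop() pairs, or -1 on failure.
  long long total() const {
    long long value = 0;
    if (!ok() || read(fd_, &value, sizeof(value)) !=
                     static_cast<ssize_t>(sizeof(value))) {
      return -1;
    }
    return value;
  }

 private:
  int fd_ = -1;
};
```

For the loop in the question, one would construct `RegionCounter instr(PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS);` before the loop, call `instr.start()` right before `do_something_1()` and `instr.stop()` right after it, and read `instr.total()` once the loop is done. Because the counter is never reset, the total covers all 100 calls. If `ok()` returns false, the setting in /proc/sys/kernel/perf_event_paranoid or a missing PMU (common inside virtual machines) is a likely cause.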

In addition, I found a very useful library called pfmlib, a helper library for programming perf events. It ships with examples and perf_examples directories that show how to do this at the code level. I am still working through it. I hope this helps you. If you have questions, we could learn from each other.

The website of pfmlib is http://perfmon2.sourceforge.net.

Upvotes: 0

doqtor

Reputation: 8494

You can use operf (oprofile).

In short:

# Build your program with debugging information
# Start up the profiler
operf /path/to/mybinary
# generate a profile summary
opreport  --symbols
# produce some annotated source
opannotate --source --output-dir=/path/to/annotated-source

Example annotated output:

$ opannotate --source --output-dir=/home/moz/src/annotated `which oprofiled`
$ vi /home/moz/src/annotated/home/moz/src/oprofile/daemon/opd_image.c # the annotated source output
...
               :static uint64_t pop_buffer_value(struct transient * trans)
   254  2.4909 :{ /* pop_buffer_value total:   2105 20.6433 */
               :        uint64_t val;
               :
   160  1.5691 :        if (!trans->remaining) {
               :                fprintf(stderr, "BUG: popping empty buffer    !\n");
               :                exit(EXIT_FAILURE);
               :        }
               :
               :        val = get_buffer_value(trans->buffer, 0);
   123  1.2062 :        trans->remaining--;
    65  0.6374 :        trans->buffer += kernel_pointer_size;
               :        return val;
   230  2.2556 :}

Examples

Upvotes: 1

Klaus

Reputation: 25663

It sounds like you are looking for profiling.

As you say you are on Linux, have a look at the gprof toolchain. You simply compile your program with a few extra compiler options and run it. gprof then inspects the generated profiling data and produces a report with information for each code block.

First: compile your program with additional options:

g++ <source> -c -g -pg
...

Second: link. You need the same options here too!

g++ <object1> <object2> ... <objectn> -g -pg -o <target>

Third: run your program

./<target>

After that, get statistics:

gprof <target>
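To try these steps on something small, here is a sketch of a workload shaped like the loop in the question. The function names are borrowed from the question; their bodies and the `run_workload` helper are made up for illustration. Note that gprof reports time and call counts per function rather than hardware counters.

```cpp
#include <cstdint>

// Hypothetical stand-in for the question's do_something_1: quadratic work.
std::uint64_t do_something_1(std::uint64_t n) {
  volatile std::uint64_t steps = 0;  // volatile keeps the loops from being optimised away
  for (std::uint64_t i = 0; i < n; ++i) {
    for (std::uint64_t j = 0; j < n; ++j) {
      steps = steps + 1;
    }
  }
  return steps;  // n * n
}

// Hypothetical stand-in for the question's do_something_2: linear work.
std::uint64_t do_something_2(std::uint64_t n) {
  volatile std::uint64_t steps = 0;
  for (std::uint64_t i = 0; i < n; ++i) {
    steps = steps + 1;
  }
  return steps;  // n
}

// Mirrors the question's loop and returns the total number of steps taken.
std::uint64_t run_workload(int iterations, std::uint64_t n) {
  std::uint64_t total = 0;
  for (int i = 0; i < iterations; ++i) {
    total += do_something_1(n);
    total += do_something_2(n);
  }
  return total;
}
```

Calling `run_workload(100, 2000)` from `main` and following the compile, link, run and gprof steps should give a flat profile in which do_something_1 takes most of the time, since it does n times as much work per call as do_something_2.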

Upvotes: 0
