Reputation: 1374
As we all know, perf is the tool for reading the CPU performance counters of a program, such as cache-misses, cache-references, instructions executed, etc.
Question:
How can I get those performance counters for just a piece of code (such as a function) within a program written in C or C++?
For example, my program first does some initialization, then does the work, then finalizes. I only want the performance counters for the work, such as the function do_something_1.
int main(int argc, char ** argv) {
    do_initialize();
    for (int i = 0; i < 100; i++) {
        /* begin profile code */
        do_something_1();
        /* end profile code */
        do_something_2();
    }
    do_finalize();
}
Upvotes: 7
Views: 7316
Reputation: 51
I did some research to solve the same problem in my project and found another framework called SkyPat (https://skypat.skymizer.com), which, like PAPI, can read the PMU counters for a piece of code.
I have tried both PAPI and SkyPat to get the PMU counters for a function. I think the difference between them is that SkyPat combines unit tests with perf_event. It borrows the concepts of Google Test and provides an interface to access the PMU, so it is easy to integrate with Google Test.
For example, to measure cache references and cache misses for a function:
#include <unistd.h>
#include "pat/pat.h"
#include "test.h"
PAT_F(MyCase, my_test)
{
    int result = 0;
    COUNT(pat::CONTEXT_SWITCHES) {
        test(10);
    }
    COUNT(pat::CPU_CLOCK) {
        test(10);
    }
    COUNT(pat::TASK_CLOCK) {
        test(10);
    }
    COUNT(pat::CACHE_REFERENCES) {
        test(10);
    }
    COUNT(pat::CACHE_MISSES) {
        test(10);
    }
}
int main(int argc, char* argv[])
{
    pat::Test::Initialize(&argc, argv);
    pat::Test::RunAll();
}
And here is the resulting SkyPat log:
[ pat ] Running 1 tests from 1 cases.
[----------] 1 test from MyCase.
[ RUN ] MyCase.my_test
[ TIME (ns)] 2537 1000 843 1855 1293
[EVENT TYPE] [CTX SWITCH] [CPU CLOCK] [TASK CLOCK] [CACHE REF] [CACHE MISS]
[RESULT NUM] 0 982 818 2 0
[==========] 1 test from 1 cases ran.
[ PASSED ] 1 test.
Upvotes: 1
Reputation: 1374
Finally, I found a library, PAPI, that can read those counters for a piece of code.
For example, to measure L3 data cache reads for a piece of code:
#include "papi.h"
#include <iostream>
#include <glog/logging.h>
#define ASIZE 2684354560  // number of ints; note this allocates about 10 GiB
#define event_count (1)   // the number of events you want to trace
int main(int argc, char ** argv) {
    int events[event_count] = {PAPI_L3_DCR}; // L3 data cache reads
    int ret;
    long long int values[event_count]; // results, one per event
    int* array = new int[ASIZE];
    /* start counters (classic PAPI high-level API) */
    ret = PAPI_start_counters(events, event_count);
    CHECK_EQ(ret, PAPI_OK);
    /* the code being measured */
    size_t tot_cnt = 1;
    for (size_t cnt = 0; cnt < tot_cnt; cnt++) {
        for (size_t i = 0; i < ASIZE; i++) {
            array[i] = static_cast<int>(i);
        }
    }
    /* read counters */
    ret = PAPI_read_counters(values, event_count);
    CHECK_EQ(ret, PAPI_OK);
    for (size_t i = 0; i < event_count; i++) {
        LOG(INFO) << " " << values[i];
    }
    delete[] array;
    return 0;
}
Makefile:
CXX?=g++
INC?=-I<path to where papi is installed>/include/
LIB?=-L<path to where papi is installed>/lib/ -lpapi -lglog
all : main
main : main.cpp
	${CXX} -O3 ${INC} -o $@ $< ${LIB}
.PHONY: all clean
clean :
	rm -f main
Upvotes: 6
Reputation: 21
I am facing the same situation and did some research on it. Here is what I learned. First, perf is part of the kernel, and you can check its headers in
/usr/src/kernels/$VERSION/include/linux/perf_regs.h
/usr/src/kernels/$VERSION/include/linux/perf_event.h
/usr/src/kernels/$VERSION/include/uapi/linux/perf_event.h
I think the core file is perf_event.h. You could also check its GitHub site, which has some explanation of how to use it, but it is not very clear and I still have many open questions.
In addition, I found a very useful library called pfmlib, a helper library for programming perf events. It includes examples and perf_examples that show how to do this at the code level. I am still working on it. I hope this helps. If you have questions, we can learn from each other.
The pfmlib website is http://perfmon2.sourceforge.net.
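To make the perf_event.h route more concrete, here is a minimal sketch of counting a single hardware event around one region through the perf_event_open system call. The names RegionCounter, open_perf_event, profile_work, and the body of do_something_1 are hypothetical and made up for illustration; only the syscall, the perf_event_attr fields, and the ioctl requests come from the kernel header. It assumes Linux and a kernel.perf_event_paranoid setting that allows user-space counting.

```cpp
#include <linux/perf_event.h>  // perf_event_attr, PERF_* constants
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cstdint>
#include <cstring>

// glibc has no wrapper for perf_event_open, so go through syscall(2).
int open_perf_event(perf_event_attr* attr, pid_t pid, int cpu,
                    int group_fd, unsigned long flags) {
    return static_cast<int>(
        syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags));
}

// Hypothetical helper: counts one hardware event for the calling thread,
// but only between start() and stop(). Counts accumulate across regions.
class RegionCounter {
public:
    explicit RegionCounter(std::uint64_t config) {
        perf_event_attr attr;
        std::memset(&attr, 0, sizeof(attr));
        attr.type = PERF_TYPE_HARDWARE;
        attr.size = sizeof(attr);
        attr.config = config;
        attr.disabled = 1;        // stay off until start()
        attr.exclude_kernel = 1;  // user-space only
        attr.exclude_hv = 1;
        fd_ = open_perf_event(&attr, 0 /* this thread */, -1 /* any CPU */, -1, 0);
    }
    ~RegionCounter() { if (fd_ >= 0) close(fd_); }
    RegionCounter(const RegionCounter&) = delete;
    RegionCounter& operator=(const RegionCounter&) = delete;

    bool ok() const { return fd_ >= 0; }
    void start() { if (ok()) ioctl(fd_, PERF_EVENT_IOC_ENABLE, 0); }
    void stop()  { if (ok()) ioctl(fd_, PERF_EVENT_IOC_DISABLE, 0); }

    // Writes the accumulated count to `out`; returns false if unavailable.
    bool read_total(std::uint64_t& out) const {
        std::uint64_t value = 0;
        if (!ok() || ::read(fd_, &value, sizeof(value)) !=
                         static_cast<ssize_t>(sizeof(value))) {
            return false;
        }
        out = value;
        return true;
    }

private:
    int fd_ = -1;
};

// Stand-in for the question's do_something_1(); any workload would do.
std::uint64_t do_something_1() {
    volatile std::uint64_t sum = 0;
    for (std::uint64_t i = 0; i < 100000; ++i) sum = sum + i;
    return sum;
}

// Mirrors the loop from the question: only do_something_1() is counted.
bool profile_work(std::uint64_t config, std::uint64_t& total) {
    RegionCounter counter(config);
    for (int i = 0; i < 100; ++i) {
        counter.start();  /* begin profile code */
        do_something_1();
        counter.stop();   /* end profile code */
        // do_something_2() would run here, uncounted
    }
    return counter.read_total(total);
}
```

Calling profile_work(PERF_COUNT_HW_CACHE_MISSES, total) from main would, if the counter could be opened, leave the accumulated cache-miss count in total. When the call returns false (for example in a container or VM without PMU access), total is left untouched.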
Upvotes: 0
Reputation: 8494
You can use operf (oprofile).
In short:
# Build your program with debugging information
# Start up the profiler
operf /path/to/mybinary
# generate a profile summary
opreport --symbols
# produce some annotated source
opannotate --source --output-dir=/path/to/annotated-source
Example annotated output:
$ opannotate --source --output-dir=/home/moz/src/annotated `which oprofiled`
$ vi /home/moz/src/annotated/home/moz/src/oprofile/daemon/opd_image.c # the annotated source output
...
:static uint64_t pop_buffer_value(struct transient * trans)
254 2.4909 :{ /* pop_buffer_value total: 2105 20.6433 */
: uint64_t val;
:
160 1.5691 : if (!trans->remaining) {
: fprintf(stderr, "BUG: popping empty buffer !\n");
: exit(EXIT_FAILURE);
: }
:
: val = get_buffer_value(trans->buffer, 0);
123 1.2062 : trans->remaining--;
65 0.6374 : trans->buffer += kernel_pointer_size;
: return val;
230 2.2556 :}
Upvotes: 1
Reputation: 25663
It sounds like you are looking for a profiler.
Since you say you are on Linux, have a look at the gprof toolchain. You simply compile your program with some extra compiler options and run it. gprof then inspects the generated profiling data and produces a report with information for each function.
First, compile your program with additional options:
g++ <source> -c -g -pg
...
Second, link. You need these options here as well:
g++ <object1> <object2> ... <objectn> -g -pg -o <target>
Third, run your program:
./<target>
After that, get the statistics:
gprof <target>
Upvotes: 0