Reputation: 345
I am using Ubuntu Linux and programming in C++. I have been able to access the performance counters (instruction counts, cache misses, etc.) using perf_event (actually using programs from this link: https://github.com/castl/easyperf).
However, now I am running a multi-threaded application using pthreads, and need the instruction counts and cycles to completion of each thread separately. Any ideas on how to go about this?
Thanks!
Upvotes: 3
Views: 7138
Reputation: 94205
You can use the standard tool for accessing perf_event: perf (from linux-tools). It can work with all threads of your program and report both a summary profile and a per-thread (per-pid/per-tid) profile.
This profile does not give exact hardware counter values; it is the result of sampling every N events. With -F 99, N is tuned so that samples are taken about 99 times per second. You can also use the -c 2000000 option to take a sample every 2 million hardware events. For example, for the cycles event (for the full list, run perf list, or try some of the events shown by perf stat ./program):
perf record -e cycles -F 99 ./program
perf record -e cycles -c 2000000 ./program
Summary over all threads; -n will show you the total number of samples:
perf report -n
Per pid (tids are actually used here, so you can select any thread). The text variant lists all recorded threads with their sample counts (with -c 2000000 you can multiply a thread's sample count by 2 million to estimate the hardware event count for that thread):
perf report -n -s pid | cat
Or the ncurses-like interactive variant, where you can select any thread and see its own profile:
perf report -n -s pid
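The estimate described above (sample count times the -c period) is plain arithmetic. A minimal C++ sketch, where estimate_event_count is a hypothetical helper name and not part of perf:

```cpp
#include <cstdint>

// Estimate a thread's hardware event total from a fixed-period sampling run.
//   sample_count:  the per-thread sample count shown by `perf report -n -s pid`
//   sample_period: the value that was passed to `perf record -c`
// This is only an estimate: events since the last sample are not counted.
std::uint64_t estimate_event_count(std::uint64_t sample_count,
                                   std::uint64_t sample_period) {
    return sample_count * sample_period;
}
```

For instance, a thread with 1500 samples recorded with -c 2000000 would be estimated at roughly 3,000,000,000 cycles.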
Upvotes: 2
Reputation: 3022
perf is a system profiling tool you can use. Unlike https://github.com/castl/easyperf, which is a library that you use in your code, perf runs as a separate tool. Follow these steps to profile your program with it:
Install perf on Ubuntu. Installation can differ considerably between Linux distributions; you can find installation tutorials online.
Run your program and get all of its thread IDs:
ps -eLf | grep [application name]
Open a separate terminal and run perf as perf stat -t [threadid], according to the man page:
usage: perf stat [<options>] [<command>]
-e, --event <event> event selector. use 'perf list' to list available events
-i, --no-inherit child tasks do not inherit counters
-p, --pid <n> stat events on existing process id
-t, --tid <n> stat events on existing thread id
-a, --all-cpus system-wide collection from all CPUs
-c, --scale scale/normalize counters
-v, --verbose be more verbose (show counter open errors, etc)
-r, --repeat <n> repeat command and print average + stddev (max: 100)
-n, --null null run - dont start any counters
-B, --big-num print large numbers with thousands' separators
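If you would rather collect the thread IDs from code than from ps, one option on Linux is to read the /proc/<pid>/task directory, which has one entry per thread. The following is only a sketch under that assumption; list_thread_ids and list_own_thread_ids are hypothetical helper names, not part of perf or easyperf:

```cpp
#include <dirent.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstdlib>
#include <string>
#include <vector>

// Return the thread IDs (TIDs) of process `pid` by listing /proc/<pid>/task.
// Returns an empty vector if the directory cannot be opened.
std::vector<pid_t> list_thread_ids(pid_t pid) {
    std::vector<pid_t> tids;
    const std::string dir = "/proc/" + std::to_string(pid) + "/task";
    DIR* d = opendir(dir.c_str());
    if (d == nullptr) {
        return tids;
    }
    while (dirent* entry = readdir(d)) {
        // Keep only purely numeric names; this skips "." and "..".
        char* end = nullptr;
        const long tid = std::strtol(entry->d_name, &end, 10);
        if (end != entry->d_name && *end == '\0' && tid > 0) {
            tids.push_back(static_cast<pid_t>(tid));
        }
    }
    closedir(d);
    return tids;
}

// Convenience wrapper for the calling process.
std::vector<pid_t> list_own_thread_ids() {
    return list_thread_ids(getpid());
}
```

Each returned value can be passed to perf stat -t. Threads that start after the directory was read will not appear in the list.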
There is also an analysis article about perf that can give you a feel for it.
Upvotes: 4
Reputation: 10199
Please take a look at the perf tool documentation here; it supports some of the events (e.g. both instructions and cache-misses) that you're looking to profile. Extract from the wiki page linked above:
The perf tool can be used to count events on a per-thread, per-process, per-cpu or system-wide basis. In per-thread mode, the counter only monitors the execution of a designated thread. When the thread is scheduled out, monitoring stops. When a thread migrates from one processor to another, counters are saved on the current processor and are restored on the new one.
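The same per-thread mode is available from C++ through the perf_event_open system call: passing a thread ID (or 0 for the calling thread) with cpu set to -1 asks the kernel to follow that thread across CPUs. The following is a minimal sketch, not the easyperf API; open_thread_counter and count_on_this_thread are hypothetical names, and the call can fail (returning -1 here) depending on the /proc/sys/kernel/perf_event_paranoid setting or on whether hardware counters are exposed, for example inside a virtual machine.

```cpp
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstdint>
#include <cstring>

// Open one hardware counter for thread `tid` (0 means the calling thread),
// followed on whichever CPU it runs on (cpu = -1).
// Returns a file descriptor, or -1 if the kernel refuses the request.
int open_thread_counter(std::uint64_t config, pid_t tid) {
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = config;     // e.g. PERF_COUNT_HW_INSTRUCTIONS
    attr.disabled = 1;        // start stopped; enabled explicitly below
    attr.exclude_kernel = 1;  // count user-space events only
    attr.exclude_hv = 1;
    // attr.inherit stays 0, so threads created later are not included.
    return static_cast<int>(
        syscall(__NR_perf_event_open, &attr, tid, -1, -1, 0));
}

// Count `config` events on the calling thread while `work` runs.
// Returns false (leaving `out` untouched) if the counter is unavailable.
bool count_on_this_thread(std::uint64_t config, void (*work)(),
                          std::uint64_t& out) {
    const int fd = open_thread_counter(config, 0);
    if (fd < 0) {
        return false;
    }
    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    work();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    std::uint64_t value = 0;
    const bool ok =
        read(fd, &value, sizeof(value)) == static_cast<ssize_t>(sizeof(value));
    close(fd);
    if (ok) {
        out = value;
    }
    return ok;
}
```

With pthreads, each thread's start routine could call count_on_this_thread once with PERF_COUNT_HW_INSTRUCTIONS and once with PERF_COUNT_HW_CPU_CYCLES (or open both counters around the same work) and store the results in per-thread storage for reporting after pthread_join.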
Upvotes: 1