Reputation: 9
I need to write a program which will find out the performance/property of other programs.
For example, lets say, I want to know what are the functions are getting called when i give top command in linux. I need to know the which function in top command is called first and which function is responsible for printing output.
Or it can be thought in another way, I want to know what are the functions are getting called in a program, in which order,and how much time it spend for its execution.
Upvotes: 0
Views: 52
Reputation: 1182
If you're just interested in system calls you can use a trace tool like strace:
strace -tt -o strace.log -C top
Will run "top" and log system calls including timings to the file "strace.log", including a summary at the end showing how much time was spent doing various calls. But this will of course not capture what happens in the application code itself - for that you need real profiling like described in the previous response. for a quick-and-dirty view of what happens in a running binary, though, a trace program can be a good tool.
Upvotes: 0
Reputation: 25459
There are two popular approaches to accumulate profiling information about a program. You can either use a dedicated profiler or manually instrument your code with calls to clock
. In both cases you will have to re-compile your code but only in the second, you will have to modify your source code.
If you are using the GNU toolchain, you are looking for Gprof. It comes together with GCC and does just what you've asked for. It works in three steps:
-pg
. This will cause GCC to instrument your binary with special instructions that accumulate the profiling statistics.gprof main > profile.txt
where you substitute your executable name for main
. This will output human-readable profiling data to the file profile.txt
.The file contains two sections: the “flat profile” and the “call graph”. Quoting the manual:
The flat profile shows the total amount of time your program spent executing each function. […]
This is part of a flat profile for a small program:
Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 33.34 0.02 0.02 7208 0.00 0.00 open 16.67 0.03 0.01 244 0.04 0.12 offtime 16.67 0.04 0.01 8 1.25 1.25 memccpy 16.67 0.05 0.01 7 1.43 1.43 write 16.67 0.06 0.01 mcount 0.00 0.06 0.00 236 0.00 0.00 tzset 0.00 0.06 0.00 192 0.00 0.00 tolower 0.00 0.06 0.00 47 0.00 0.00 strlen 0.00 0.06 0.00 45 0.00 0.00 strchr 0.00 0.06 0.00 1 0.00 50.00 main 0.00 0.06 0.00 1 0.00 0.00 memcpy 0.00 0.06 0.00 1 0.00 10.11 print 0.00 0.06 0.00 1 0.00 0.00 profil 0.00 0.06 0.00 1 0.00 50.00 report ...
The functions are sorted first by decreasing run-time spent in them, then by decreasing number of calls, then alphabetically by name. The functions
mcount
andprofil
are part of the profiling apparatus and appear in every flat profile; their time gives a measure of the amount of overhead due to profiling.
And quoting the manual again:
The call graph shows how much time was spent in each function and its children. From this information, you can find functions that, while they themselves may not have used much time, called other functions that did use unusual amounts of time.
Here is a sample call from a small program. This call came from the same gprof run as the flat profile example in the previous section.
granularity: each sample hit covers 2 byte(s) for 20.00% of 0.05 seconds index % time self children called name <spontaneous> [1] 100.0 0.00 0.05 start [1] 0.00 0.05 1/1 main [2] 0.00 0.00 1/2 on_exit [28] 0.00 0.00 1/1 exit [59] ----------------------------------------------- 0.00 0.05 1/1 start [1] [2] 100.0 0.00 0.05 1 main [2] 0.00 0.05 1/1 report [3] ----------------------------------------------- 0.00 0.05 1/1 main [2] [3] 100.0 0.00 0.05 1 report [3] 0.00 0.03 8/8 timelocal [6] 0.00 0.01 1/1 print [9] 0.00 0.01 9/9 fgets [12] 0.00 0.00 12/34 strncmp <cycle 1> [40] 0.00 0.00 8/8 lookup [20] 0.00 0.00 1/1 fopen [21] 0.00 0.00 8/8 chewtime [24] 0.00 0.00 8/16 skipspace [44] ----------------------------------------------- [4] 59.8 0.01 0.02 8+472 <cycle 2 as a whole> [4] 0.01 0.02 244+260 offtime <cycle 2> [7] 0.00 0.00 236+1 tzset <cycle 2> [26] -----------------------------------------------
The lines full of dashes divide this table into entries, one for each function. Each entry has one or more lines.
In each entry, the primary line is the one that starts with an index number in square brackets. The end of this line says which function the entry is for. The preceding lines in the entry describe the callers of this function and the following lines describe its subroutines (also called children when we speak of the call graph).
The entries are sorted by time spent in the function and its subroutines.
I hope this could give you an idea and recommend you read through the manual yourself to find out how to use Gprof effectively. A final great thing about Gprof is that you can feed the statistics it gathered back into GCC to let it optimize your code better. This is known as profile guided optimization (PGO).
If you insist in not using tools like Gprof, you can of course do things by hand but I recommend you stay away from this if all possible. Why would you re-invent a flat tire if you can have a wheel for free? Of course, there are tradeoffs for everything and in some situations, you might only want to profile a few functions and the manual approach might be warranted or even superior. (Often times, it won't.)
Anyway, you could decide what functions are interesting to you. For example, let's assume we have the functions foo
, bar
and baz
that we want to profile. You start by numbering them, preferably in an enumeration.
enum ProfiledFunctions {
PROF_FOO,
PROF_BAR,
PROF_BAZ,
PROF_LENGTH, // dummy, counts the number of functions
};
Then you create a global array to record the time spent in each function.
#include <time.h>
static clock_t profile_data[PROF_LENGTH];
And instrument all your functions.
int
foo(int a, int b)
{
#if COLLECT_PROFILING_DATA
clock_t t_begin, t_end;
t_begin = clock();
#endif
// Do useful stuff with 'a' and 'b'...
#if COLLECT_PROFILING_DATA
t_end = clock();
profile_data[PROF_FOO] += t_end - t_begin;
#endif
return 0;
}
Of course, you do the same to bar
and baz
as well. You might want to use some preprocessor magic to reduce the amount of typing redundant boilerplate code.
I have surrounded the profiling logic with conditionals so we can compile them in and out as we see fit (using -DCOLLECT_PROFILING_DATA=1
on the compiler command line). The last thing you want to do is going over your code more than once adding / removing this boilerplate logic. (The second last thing you want to do is doing this even once which lets me refer back to tools like Gprof.)
Finally, you go and change your main
.
int
main()
{
#if COLLECT_PROFILING_DATA
clock_t t_begin, t_end;
t_begin = clock();
#endif
// Do what your program is supposed to do...
#if COLLECT_PROFILING_DATA
t_end = clock();
fprintf(stderr, "foo: %ju\n", (uintmax_t) profile_data[PROF_FOO]);
fprintf(stderr, "bar: %ju\n", (uintmax_t) profile_data[PROF_BAR]);
fprintf(stderr, "baz: %ju\n", (uintmax_t) profile_data[PROF_BAZ]);
fprintf(stderr, "main: %ju\n", (uintmax_t) (t_end - t_begin));
#endif
return 0;
}
I'm also printing the time spent in main
– that is, the program as a whole – so you can tell whether the function you have chosen to profile actually cover most of the execution time. In this timing, I have excluded the profiling overhead for printing the statistics themselves as it would be misleading.
The statistics gathered by this approach will be cumulative. If foo
calls bar
(and nobody else calls bar
) then the time reported for foo
will include all time spent in bar
plus whatever else foo
did. If a function calls itself recursively, you will get false data, leave alone with functions that call longjmp
or similar. If your functions do this kind of thing, you really want to use a dedicated profiler tool.
Upvotes: 1