kimsay
kimsay

Reputation: 321

perf tool output, magic values

I ran perf with the parameter -x to print in machine readable format. The output is as follows:

1285831153,,instructions,1323535732,100.00
7332248,,branch-misses,1323535732,100.00
1316.587352,,cpu-clock,1316776510,100.00
1568113343,,cycles,1323535732,100.00

the first number is clear but then the values after the descriptions are not clear to me. Is the first one behind the description the runtime? Then why is it different? What does the 100.00 mean at the end of each line? It is not documented; I looked it up here: https://perf.wiki.kernel.org/index.php/Tutorial#Machine_readable_output

Upvotes: 0

Views: 112

Answers (1)

osgx
osgx

Reputation: 94225

-x option of stat command is implemented in tools/perf/builtin-stat.c file as csv_output flag, and printing is static void printout function "(line 1061). Last values in the string are probably from:

    print_noise(counter, noise);
    print_running(run, ena);

With single run of target program (no -r 5 or -r 2 options - https://perf.wiki.kernel.org/index.php/Tutorial#Repeated_measurement) print_noise will not print anything. And print_running is printing the "run" argument twice, as value and as percentage of ena

static void print_running(u64 run, u64 ena)
{
    if (csv_output) {
        fprintf(stat_config.output, "%s%" PRIu64 "%s%.2f",
                    csv_sep,
                    run,
                    csv_sep,
                    ena ? 100.0 * run / ena : 100.0);
    } else if (run != ena) {
        fprintf(stat_config.output, "  (%.2f%%)", 100.0 * run / ena);
    }
}

You have run/ena = 1 (100.00%), so theses field have no useful information for you.

They are used in the case of event multiplexing (try perf stat -d or perf stat -dd; https://perf.wiki.kernel.org/index.php/Tutorial#multiplexing_and_scaling_events) when user ask perf to measure more event that can be enabled at same time (8 hardware events on intel with only 7 real hardware counting hardware units). Perf (perf_events subsystem of kernel) will enable some subsets of events and will change these subsets several times per second. Then run/ena will be proportional to the time share when this event was enabled, and run will probably show exact time amount when the event was counted. With normal human-readable perf stat this is marked when there is no [100%] for the event line; and the reported event count may be scaled (estimated) for the full running time of the program (inexact scaled).

Upvotes: 1

Related Questions