Apitronix

Reputation: 273

What do the options of the "read_format" attribute of the "perf_event_attr" structure actually mean?

I'm currently using the perf_event_open syscall (on Linux systems), and I'm trying to understand one of its configuration parameters, which is passed through the struct perf_event_attr structure.

It's about the read_format option. As anyone can see on the man page of this syscall, this parameter controls the format of the data returned when the event is read.

But I don't understand what each possible value does.


Especially these two possibilities:

Can anyone with that information give me a straight answer?

Upvotes: 3

Views: 463

Answers (2)

Alan Nair

Reputation: 1

The top answer to this question is incorrect.

A CPU has a limited number of performance counters. A typical Intel CPU core has 4 generic PMU counters, plus 3 fixed counters for instructions, cycles, and ref-cycles. So if the number of events to be monitored exceeds the number of available counters, perf does time-based multiplexing of events. In other words, the kernel rotates the events assigned to the counters at a frequency of 100 or 1000 Hz so that every event gets a turn on the hardware. As a result, not every event is monitored all the time.

PERF_FORMAT_TOTAL_TIME_ENABLED is the total time for which the event-monitoring was enabled, while PERF_FORMAT_TOTAL_TIME_RUNNING is the total time for which the event was actually monitored. If the number of events being monitored exceeds the number of counters, the latter time will be less than the former. In such cases, the reported value is scaled by time_enabled / time_running to estimate what the value would have been had monitoring been running throughout.
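The scaling described above can be sketched as a small helper. This is only an illustration: the function name scale_count is made up for this example, and it assumes the three numbers were obtained by reading an event opened with both time flags set. Note that read() on the event returns the raw count, so this estimate is something the caller (or a tool like perf) computes afterwards.

```cpp
#include <cstdint>

// Hypothetical helper (not part of any perf API): estimate what the count
// would have been if the event had owned a counter for the whole time it
// was enabled.
inline double scale_count(std::uint64_t raw_value,
                          std::uint64_t time_enabled,
                          std::uint64_t time_running) {
    if (time_running == 0) {
        // The event never got a hardware counter, so no estimate is possible.
        return 0.0;
    }
    // estimate = raw_value * time_enabled / time_running
    return static_cast<double>(raw_value) *
           static_cast<double>(time_enabled) /
           static_cast<double>(time_running);
}
```

For instance, a raw count of 1000 with time_enabled = 200 and time_running = 100 means the event was on the hardware half of the time, so the estimate is 2000.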

This comment in perf code shows what your perf_read_format structure should look like.
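For reference, here is a sketch of that layout for the simplest case, as documented in the perf_event_open man page: a single event (no PERF_FORMAT_GROUP, no PERF_FORMAT_ID) opened with both time flags. The names ReadValue and decode_read_buffer are invented for this example; only the field order comes from the documentation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// What read() returns for a single (non-group) event when read_format is
// PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING.
// The struct name is illustrative; the field order follows the man page.
struct ReadValue {
    std::uint64_t value;         // raw event count
    std::uint64_t time_enabled;  // present because of PERF_FORMAT_TOTAL_TIME_ENABLED
    std::uint64_t time_running;  // present because of PERF_FORMAT_TOTAL_TIME_RUNNING
};

// Hypothetical helper: copy the bytes returned by read() into the struct.
// Returns false if the buffer is too short to hold all three fields.
inline bool decode_read_buffer(const void* buf, std::size_t len, ReadValue& out) {
    if (len < sizeof(ReadValue)) {
        return false;
    }
    std::memcpy(&out, buf, sizeof(ReadValue));
    return true;
}
```

If other read_format flags are added (PERF_FORMAT_ID, PERF_FORMAT_GROUP, ...), the layout changes, so this struct only fits the exact flag combination named above.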

Upvotes: 0

Apitronix

Reputation: 273

Ok.

I've looked a little further, and I think I have found an answer.


  • PERF_FORMAT_TOTAL_TIME_ENABLED: It seems that the "enabled time" refers to the difference between the time the event stops being observed and the time the event was registered as "to be observed".

  • PERF_FORMAT_TOTAL_TIME_RUNNING: It seems that the "running time" refers to the sum of the periods during which the event is actually observed by the kernel. It is smaller than or equal to the enabled time.


For example:

You tell your kernel that you want to observe event X at 1:13:05 PM. Your kernel creates a "probe" on X and starts recording the activity. Then, for some reason, you tell it to pause the recording at 1:14:05 PM. You resume the recording at 1:15:05 PM. Finally, you stop the recording at 1:15:35 PM.

You have 00:02:30 enabled time (1:15:35 PM - 1:13:05 PM = 00:02:30)

and 00:01:30 running time (1:14:05 PM - 1:13:05 PM + 1:15:35 PM - 1:15:05 PM = 00:01:30)


The read_format attribute can request both values at once by combining the flags in a bitmask. In C++, it looks like this:

event_configuration.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING;

where event_configuration is an instance of struct perf_event_attr.
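To put that line in context, here is a minimal sketch of how such a configuration might be built and passed to the syscall. The helper names and the choice of counting instructions are assumptions made for this example; the struct fields, constants and syscall number come from <linux/perf_event.h> and <sys/syscall.h>.

```cpp
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cstring>

// Hypothetical helper: build a configuration that counts retired
// instructions in user space and asks read() to return both time fields.
inline perf_event_attr make_instruction_counter_attr() {
    perf_event_attr event_configuration;
    std::memset(&event_configuration, 0, sizeof(event_configuration));
    event_configuration.type = PERF_TYPE_HARDWARE;
    event_configuration.size = sizeof(event_configuration);
    event_configuration.config = PERF_COUNT_HW_INSTRUCTIONS;  // example event
    event_configuration.disabled = 1;        // start disabled; enable via ioctl later
    event_configuration.exclude_kernel = 1;  // count user space only
    event_configuration.exclude_hv = 1;
    event_configuration.read_format =
        PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING;
    return event_configuration;
}

// glibc has no wrapper for perf_event_open, so it is invoked via syscall().
// Returns a file descriptor, or -1 with errno set on failure.
inline int open_counter_for_this_thread(perf_event_attr& attr) {
    return static_cast<int>(syscall(SYS_perf_event_open, &attr,
                                    0,     // pid 0: the calling thread
                                    -1,    // cpu -1: any CPU
                                    -1,    // group_fd -1: no event group
                                    0UL)); // flags
}
```

After a successful open, the event would typically be enabled with ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) and then read with read(), which returns three 64-bit values for this read_format: the count, the enabled time and the running time. Whether the open succeeds depends on the machine (for example on the perf_event_paranoid setting), so the return value should always be checked.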

Upvotes: 3
