possible misuse of perf_event_open syscall

Question

I am experimenting with perf_events, the performance event interface provided by the Linux kernel. I was able to read performance counters (CPU cycles, ...) through the perf_event_open syscall:

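#include <linux/perf_event.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc provides no wrapper for perf_event_open, so call it via syscall(). */
static long
perf_event_open(struct perf_event_attr *hw_event, pid_t pid,
                int cpu, int group_fd, unsigned long flags)
{
    return syscall(__NR_perf_event_open, hw_event, pid, cpu,
                   group_fd, flags);
}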

int
main(int argc, char **argv)
{
   struct perf_event_attr pe;
   long long count;
   int fd;

   memset(&pe, 0, sizeof(struct perf_event_attr));
   pe.type = PERF_TYPE_HARDWARE;
   pe.size = sizeof(struct perf_event_attr);
   pe.config = PERF_COUNT_HW_CPU_CYCLES;
   pe.disabled = 1;
   pe.exclude_idle = 1;
   pe.exclude_kernel = 1;
   pe.exclude_callchain_kernel = 1;

   fd = perf_event_open(&pe, 0, -1, -1, 0);
   if (fd == -1) {
       fprintf(stderr, "Error opening leader %llx\n", pe.config);
       exit(EXIT_FAILURE);
   }

   ioctl(fd, PERF_EVENT_IOC_RESET, 0);
   ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

   printf("Measuring cycle count for this printf\n");

   ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
   read(fd, &count, sizeof(long long));

   printf("%lld\n", count);

   return 0;
}

However, I don't fully understand how to use perf_event_open. I am blindly passing -1 as the 4th parameter (group_fd). I don't know when to group events, when to keep them separate, or which event should be the group "leader".

Below is the documentation of the 4th parameter, from the perf_event_open man page:

The group_fd argument allows event groups to be created. An event group has one event which is the group leader. The leader is created first, with group_fd = -1. The rest of the group members are created with subsequent perf_event_open() calls with group_fd being set to the fd of the group leader. (A single event on its own is created with group_fd = -1 and is considered to be a group with only 1 member.) An event group is scheduled onto the CPU as a unit: it will only be put onto the CPU if all of the events in the group can be put onto the CPU. This means that the values of the member events can be meaningfully compared, added, divided (to get ratios), etc., with each other, since they have counted events for the same set of executed instructions.

Can anyone shed some light on the 4th parameter (and, if possible, its relation to the 5th, flags)? What is the proper way of doing this? An example would help a lot.
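
For reference, here is a minimal sketch of the grouping that the quoted documentation describes, not a definitive implementation. It assumes a machine that exposes hardware counters to unprivileged users. The leader (CPU cycles) is opened with group_fd = -1, and a second event (instructions) joins the group by passing the leader's fd as group_fd. Setting read_format to PERF_FORMAT_GROUP lets a single read() on the leader return both values, and PERF_IOC_FLAG_GROUP applies each ioctl to the whole group. The names group_counts, open_hw_counter, count_cycles_and_instructions and instructions_per_cycle are made up for this illustration. As far as I can tell from the man page, the 5th argument (flags) is mostly independent of grouping: PERF_FLAG_FD_NO_GROUP makes the kernel ignore group_fd except for output redirection, and PERF_FLAG_FD_CLOEXEC sets close-on-exec on the new fd, so 0 is enough here.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* What read() on the leader returns when read_format == PERF_FORMAT_GROUP:
 * the number of events, then one value per event in creation order. */
struct group_counts {
    uint64_t nr;
    uint64_t values[2]; /* [0] = cycles (leader), [1] = instructions */
};

/* Open one user-space-only hardware counter on the calling thread.
 * group_fd == -1 creates a leader; otherwise the event joins that group. */
static int open_hw_counter(uint64_t config, int group_fd)
{
    struct perf_event_attr pe;

    memset(&pe, 0, sizeof(pe));
    pe.type = PERF_TYPE_HARDWARE;
    pe.size = sizeof(pe);
    pe.config = config;
    pe.disabled = (group_fd == -1); /* members follow the leader's state */
    pe.exclude_kernel = 1;
    pe.exclude_hv = 1;
    pe.read_format = PERF_FORMAT_GROUP;

    /* pid = 0, cpu = -1: this thread, on any CPU; flags = 0. */
    return (int)syscall(__NR_perf_event_open, &pe, 0, -1, group_fd, 0UL);
}

/* Count cycles and instructions over the same small loop.
 * Returns 0 on success, -1 if the counters are unavailable. */
static int count_cycles_and_instructions(struct group_counts *out)
{
    int leader = open_hw_counter(PERF_COUNT_HW_CPU_CYCLES, -1);
    if (leader == -1)
        return -1;

    int member = open_hw_counter(PERF_COUNT_HW_INSTRUCTIONS, leader);
    if (member == -1) {
        close(leader);
        return -1;
    }

    /* PERF_IOC_FLAG_GROUP applies the ioctl to every event in the group. */
    ioctl(leader, PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP);
    ioctl(leader, PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP);

    volatile unsigned long sink = 0; /* placeholder workload */
    for (unsigned long i = 0; i < 1000000; i++)
        sink += i;

    ioctl(leader, PERF_EVENT_IOC_DISABLE, PERF_IOC_FLAG_GROUP);

    /* One read on the leader returns the values of the whole group. */
    ssize_t n = read(leader, out, sizeof(*out));
    close(member);
    close(leader);
    return n == (ssize_t)sizeof(*out) ? 0 : -1;
}

/* The ratio is meaningful because both events were scheduled together. */
static double instructions_per_cycle(const struct group_counts *c)
{
    return c->values[0] ? (double)c->values[1] / (double)c->values[0] : 0.0;
}
```

Called from main, count_cycles_and_instructions(&c) fills c.values[0] with cycles and c.values[1] with instructions for the loop, and instructions_per_cycle(&c) gives their ratio. Events that are never compared with each other can simply be opened separately with group_fd = -1, as in the code in the question.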
