Reputation: 43
I am using the 'perf record' command to sample hardware counters every 1 ms. It produces a binary 'perf.data' file as output, but I am not aware of any tool or command that reads the counter data from 'perf.data' into a text or CSV file. Put simply, I need to read the hardware counter event data for every 1 ms interval from the 'perf.data' file.
Some more details:
I have used the 'perf stat' command to get hardware counter event data every 10 ms, but it does not allow a sampling interval shorter than 10 ms. So I am using 'perf record' instead of 'perf stat' to sample at 1 ms. Some useful links that convinced me to use perf record: "Perf Stat vs Perf Record" and "Collecting the data for a partiulcar process from PMU for every 1 milli second"
I have also tried 'perf script', but it only supports some hardware events. For example, cache events are not supported by perf script. Link: "Can't sample hardware cache events with linux perf"
Can anyone help me with this, please? Please assume that I know how to use the perf record command and already have the perf.data file (generated by perf record).
Edited: Following are the commands and their output message on the terminal using the feedback from the answers:
output:
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (28 samples) ]
output:
Total Lost Samples: 0
Samples: 9 of event 'LLC-stores'
Event count (approx.): 7440
Samples Period
............ ............
9 7440
Samples: 9 of event 'LLC-loads'
Event count (approx.): 50008
Samples Period
............ ............
9 50008
Samples: 10 of event 'cache-misses'
Event count (approx.): 351826
Samples Period
............ ............
10 351826
output:
1 LLC-stores:
1 LLC-loads:
1 cache-misses:
1 LLC-loads:
1 cache-misses:
61 LLC-loads:
58 cache-misses:
3097 cache-misses:
1 LLC-stores:
13 LLC-stores:
4748 LLC-loads:
1390 LLC-stores:
190186 cache-misses:
1 LLC-stores:
1 LLC-loads:
1 cache-misses:
1 LLC-loads:
1 cache-misses:
1 LLC-stores:
52 cache-misses:
50 LLC-loads:
20 LLC-stores:
4110 cache-misses:
2002 LLC-loads:
748 LLC-stores:
154319 cache-misses:
43143 LLC-loads:
5265 LLC-stores:
output:
time counts unit events
0.006476856 1,115 LLC-stores
0.006476856 13,121 LLC-loads
0.006476856 9,371 cache-misses
Both perf report and perf script provide the number of samples, the period, and the event name, but not the event count for each sample. It would be really helpful if you could tell me how to get the event count for each of the 28 samples that we get from perf record.
Upvotes: 4
Views: 10805
Reputation: 2431
You should use perf record -e <event-name> ... to sample events every 1ms. It seems you are trying to read the perf.data file and organize it into human-readable data. If you are not aware of it, you should use perf report. The perf report command reads the perf.data file and generates a concise execution profile. The link below should help:
Sample analysis with perf report
You can adapt the perf report output to your requirements. You can also use perf report -F to choose the output columns, given as a comma-separated list of field names.
In addition, perf stat can print its counts in a CSV-style format when you pass a field separator with perf stat -x (for example, perf stat -x, for comma-separated output).
Edit #1:
(I am using Linux-Kernel 4.14.3 for evaluation.)
Since you want the number of events per sample taken, there are a couple of things to note. To count the number of events per sample, you need to know the sampling period. The sampling period is the number of events after which the performance counter overflows and the kernel records a sample. So essentially, in your case,
sampling period = number of events per sample
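This identity can be checked against the perf script output pasted in the question. Each line there has the form "<period> <event>:", and summing the periods for one event should reproduce the "Event count (approx.)" value that perf report prints for it. The Python sketch below is only an illustration: it assumes that two-column layout (which looks like what perf script -F period,event would print, although the exact command is not shown in the question), and the names LINE_RE, sum_periods and SAMPLE are made up for this example.

```python
import re
from collections import defaultdict

# Assumed line layout: "<period> <event-name>:" with one sample per line,
# as in the perf script output pasted in the question.
LINE_RE = re.compile(r"^\s*(\d+)\s+(\S+?):\s*$")


def sum_periods(text):
    """Return {event: (number_of_samples, sum_of_periods)}."""
    totals = defaultdict(lambda: [0, 0])
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # ignore anything that is not a sample line
        period, event = int(m.group(1)), m.group(2)
        totals[event][0] += 1        # one more sample for this event
        totals[event][1] += period   # events counted for this sample
    return {event: tuple(v) for event, v in totals.items()}


# The nine LLC-stores samples from the question, in their original order.
SAMPLE = """\
1 LLC-stores:
1 LLC-stores:
13 LLC-stores:
1390 LLC-stores:
1 LLC-stores:
1 LLC-stores:
20 LLC-stores:
748 LLC-stores:
5265 LLC-stores:
"""

print(sum_periods(SAMPLE))  # {'LLC-stores': (9, 7440)}
```

The result, 9 samples and 7440 events, agrees with the "Samples: 9 of event 'LLC-stores'" and "Event count (approx.): 7440" lines in the perf report output above, so the per-sample period is the per-sample event count being asked for.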
The sampling period is set in one of two ways: either you specify it explicitly, or you leave it unspecified and let the system choose it.
If you specify the sampling period when running perf record, the command looks something like this:
perf record -e <some_event_name> -c 1000 ...
Here -c 1000 means that the sampling period is 1000. In this case one sample is recorded for every 1000 occurrences of the event, because you have fixed the sampling period yourself.
On the other hand, if you do not specify the sampling period, the system samples at a default frequency instead (1000 samples/sec, or 4000 samples/sec in newer perf versions; the frequency can be set with -F). This means that the system automatically changes the sampling period, if need be, to maintain that frequency. In such a case, to determine the sampling period, you need to inspect the perf.data file.
Specifically, you need to dump the perf.data file using the command:
perf script -D
The output will look something like this:
0 0 0x108 [0x38]: PERF_RECORD_FORK(1:1):(0:0)
0x140 [0x30]: event: 3
.
. ... raw event: size 48 bytes
. 0000: 03 00 00 00 00 00 30 00 01 00 00 00 01 00 00 00 ......0.........
. 0010: 73 79 73 74 65 6d 64 00 00 00 00 00 00 00 00 00 systemd.........
. 0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0 0 0x140 [0x30]: PERF_RECORD_COMM: systemd:1/1
0x170 [0x38]: event: 7
.
. ... raw event: size 56 bytes
. 0000: 07 00 00 00 00 00 38 00 02 00 00 00 00 00 00 00 ......8.........
. 0010: 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0030: 00 00 00 00 00 00 00 00 ........
You can see different types of records like PERF_RECORD_FORK and PERF_RECORD_COMM and even PERF_RECORD_MMAP. You need to look specifically for records that begin with PERF_RECORD_SAMPLE inside the file. Like this:
14 173826354106096 0x10d40 [0x28]: PERF_RECORD_SAMPLE(IP, 0x1): 28179/28179: 0xffffffffa500d3b5 period: 3000 addr: 0
... thread: perf:28179
...... dso: [kernel.kallsyms]
perf 28179 [014] 173826.354106: cache-misses: ffffffffa500d3b5 [unknown] ([kernel.kallsyms])
As you can see, in this case the period is 3000, i.e. 3000 events were counted between the previous sample and this one (the number of events for this sample is 3000). Note that, as I mentioned above, this period may be adjusted from sample to sample. So you need to collect the period from every PERF_RECORD_SAMPLE record in the perf.data file.
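The collection step described above can be sketched in Python, working on the text that perf script -D prints (for example after redirecting it to a file). This is a minimal sketch, not a robust parser: it assumes the PERF_RECORD_SAMPLE line layout shown above (CPU, timestamp, file offset, size, pid/tid, IP, period) and takes the event name from the formatted line that follows each sample, which may look different in other perf versions. The names SAMPLE_RE, EVENT_RE, samples_from_dump and to_csv are hypothetical helpers for this example.

```python
import csv
import io
import re

# Assumed layout of a sample line in `perf script -D` output:
# "<cpu> <timestamp> <offset> [<size>]: PERF_RECORD_SAMPLE(...): <pid>/<tid>: <ip> period: <n> ..."
SAMPLE_RE = re.compile(
    r"^\s*(?P<cpu>\d+)\s+(?P<time>\d+)\s+0x[0-9a-fA-F]+\s+\[0x[0-9a-fA-F]+\]:\s+"
    r"PERF_RECORD_SAMPLE\(.*?\):\s+(?P<pid>\d+)/(?P<tid>\d+):\s+"
    r"(?P<ip>0x[0-9a-fA-F]+)\s+period:\s+(?P<period>\d+)"
)
# Assumed layout of the formatted line after it:
# "<comm> <pid> [<cpu>] <seconds>: <event>: ..."
EVENT_RE = re.compile(r"\[\d+\]\s+\d+\.\d+:\s+(?P<event>\S+):\s")


def samples_from_dump(lines):
    """Collect one dict per PERF_RECORD_SAMPLE record."""
    rows = []
    current = None
    for line in lines:
        m = SAMPLE_RE.match(line)
        if m:
            current = {
                "time": int(m["time"]),      # raw timestamp from the dump
                "cpu": int(m["cpu"]),
                "event": "",                 # filled in from a later line
                "period": int(m["period"]),  # events counted for this sample
            }
            rows.append(current)
            continue
        if "PERF_RECORD_" in line:
            current = None  # a different record type starts here
            continue
        e = EVENT_RE.search(line)
        if e and current is not None and not current["event"]:
            current["event"] = e["event"]
    return rows


def to_csv(rows):
    """Render the collected samples as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["time", "cpu", "event", "period"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# The sample record quoted above, used as demo input.
DUMP = """\
14 173826354106096 0x10d40 [0x28]: PERF_RECORD_SAMPLE(IP, 0x1): 28179/28179: 0xffffffffa500d3b5 period: 3000 addr: 0
... thread: perf:28179
...... dso: [kernel.kallsyms]
perf 28179 [014] 173826.354106: cache-misses: ffffffffa500d3b5 [unknown] ([kernel.kallsyms])
"""

print(to_csv(samples_from_dump(DUMP.splitlines())), end="")
# time,cpu,event,period
# 173826354106096,14,cache-misses,3000
```

To run it on a real dump, read the saved output of perf script -D line by line and pass those lines to samples_from_dump instead of DUMP. Each CSV row is then one sample, with its own timestamp and its own event count.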
Upvotes: 2