Metal Slime

Reputation: 149

Where to find documentation for perf events

In the question "cpu cache performance. store misses vs load misses", there is no answer about where to find documentation for the events listed by perf list.

I can't find it in man perf or perf help list.

I have read the event documentation for Intel 64 and AMD64, where events are described in a format like the following: Last Level Cache References, Event select 2EH, Umask 4FH. So where is the equivalent documentation for the perf events?

Edit: To be clear, I am looking for documentation of the events listed by perf list.

Upvotes: 2

Views: 1744

Answers (1)

osgx

Reputation: 94465

The list of predefined perf events such as branches, cycles, and LLC-load-misses is documented by the source code of the perf subsystem inside the Linux kernel. Each generic name is mapped to a different hardware event depending on the CPU model and microarchitecture, and on a given CPU only part of the list may be mapped. If your CPU is Intel, it can be more useful to use ocperf.py (and toplev.py) from andikleen's pmu-tools with the event names from Intel's documentation. ocperf is not official, but it is written by an Intel employee and uses the official lists from https://download.01.org/perfmon/ (https://download.01.org/perfmon/readme.txt: "This package contains performance monitoring event lists for Intel processors").

For x86 and x86_64 perf, these (ancient) predefined/generic names are mapped in the arch/x86/events directory. For example, for all Intel Core microarchitectures, check arch/x86/events/intel/core.c and search for the microarchitecture by its code name (Core, Core2, NHM=Nehalem, WSM=Westmere, SNB=SandyBridge, IVB=IvyBridge, HSW=HaSWell, BDW=BroaDWell, SKL=SKyLake, SLM=SiLverMont, and others from the lists; AMD is under amd). For Skylake, the structure is at line 394 of intel/core.c in kernel 4.15.8, and it shows that PREFETCH counters are not mapped for any of the caches ("not supported"):

 static __initconst const u64 skl_hw_cache_event_ids

 [ C(L1D ) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = 0x81d0,  /* MEM_INST_RETIRED.ALL_LOADS */
        [ C(RESULT_MISS)   ] = 0x151,   /* L1D.REPLACEMENT */
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = 0x82d0,  /* MEM_INST_RETIRED.ALL_STORES */
        [ C(RESULT_MISS)   ] = 0x0,

...
 [ C(LL  ) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = 0x1b7,   /* OFFCORE_RESPONSE */
        [ C(RESULT_MISS)   ] = 0x1b7,   /* OFFCORE_RESPONSE */
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = 0x1b7,   /* OFFCORE_RESPONSE */
        [ C(RESULT_MISS)   ] = 0x1b7,   /* OFFCORE_RESPONSE */
    },

There is also an extra structure that defines additional flags/masks for events like OFFCORE_RESPONSE:

static __initconst const u64 skl_hw_cache_extra_regs 
 [ C(LL  ) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_READ|
                       SKL_LLC_ACCESS|SKL_ANY_SNOOP,
        [ C(RESULT_MISS)   ] = SKL_DEMAND_READ|
                       SKL_L3_MISS|SKL_ANY_SNOOP|
                       SKL_SUPPLIER_NONE,
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_WRITE|
                       SKL_LLC_ACCESS|SKL_ANY_SNOOP,
        [ C(RESULT_MISS)   ] = SKL_DEMAND_WRITE|
                       SKL_L3_MISS|SKL_ANY_SNOOP|
                       SKL_SUPPLIER_NONE,
...
 [ C(NODE) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_READ|
                       SKL_L3_MISS_LOCAL_DRAM|SKL_SNOOP_DRAM,
        [ C(RESULT_MISS)   ] = SKL_DEMAND_READ|
                       SKL_L3_MISS_REMOTE|SKL_SNOOP_DRAM,
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_WRITE|
                       SKL_L3_MISS_LOCAL_DRAM|SKL_SNOOP_DRAM,
        [ C(RESULT_MISS)   ] = SKL_DEMAND_WRITE|
                       SKL_L3_MISS_REMOTE|SKL_SNOOP_DRAM,
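
The hex values in the first table use the same two fields as the Intel manuals: on x86 the low byte is the event select and the next byte is the umask. So 0x81d0 is event select 0xD0 with umask 0x81, and the question's "Event select 2EH, Umask 4FH" corresponds to 0x4f2e. Below is a minimal C sketch of that conversion. It assumes only those two fields are set (cmask, edge, inv and other upper bits are ignored), and the helper names are hypothetical, not part of the kernel or of perf:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helpers for illustration; not kernel or perf API. */

/* Bits 0-7 of an x86 raw event code hold the event select. */
static uint8_t raw_event_select(uint64_t raw)
{
    return (uint8_t)(raw & 0xff);
}

/* Bits 8-15 of an x86 raw event code hold the unit mask (umask). */
static uint8_t raw_umask(uint64_t raw)
{
    return (uint8_t)((raw >> 8) & 0xff);
}

/* Combine the two fields given in the vendor manuals into one raw code. */
static uint64_t make_raw(uint8_t event_select, uint8_t umask)
{
    return ((uint64_t)umask << 8) | event_select;
}

/*
 * Write the raw code as "r<hex>", the form perf accepts for raw events.
 * Returns the number of characters snprintf would have written.
 */
static int format_perf_raw(uint64_t raw, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "r%llx", (unsigned long long)raw);
}
```

For example, make_raw(0x2e, 0x4f) gives 0x4f2e, which format_perf_raw turns into the string r4f2e; that string can be passed to perf as a raw event, e.g. perf stat -e r4f2e.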

Upvotes: 4
