Farhad

Reputation: 516

Intel PIN: How do I see speculative instructions?

I'm writing a PIN tool where I want to see speculatively executed instructions that were eventually squashed.

I.e., if a branch direction is predicted and some instructions execute speculatively, but the branch then resolves the other way, the prediction is shown to be incorrect, those instructions are squashed, and the register file is restored.

I assume that RTN_AddInstrumentFunction only adds an instrument function to instructions that were retired (i.e. non-speculative or speculative and shown to be correct). Is there a way for me to use PIN to get access to instructions that were executed speculatively but then squashed?

Upvotes: 0

Views: 726

Answers (2)

BeeOnRope

Reputation: 65046

You can't do that with PIN and Peter has already covered the details well.

You could, however, do it with a simulation tool such as gem5. Gem5, in particular, supports both simulating x86, and reporting speculative instructions. Of course, the results you'll get are simulated, so the accuracy wrt real hardware will only be as good as the simulation itself.

A hybrid hardware/simulation approach would be to record the actual application using Intel Processor Trace, which includes information about mispredicted branches. Then run your process again in the simulator, but refer to the metadata about mispredicted branches to hint to the simulator which branches are mispredicted.

This only works (almost) exactly for direct or conditional branches [1], which have only 1 or 2 options, so the direction a mispredict takes is evident. For indirect jumps with more than two targets, you'll have to guess what target was mispredicted.

[1] In fact, you can also get mispredictions to arbitrary addresses for direct and conditional branches when there are collisions in the predictors.
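
To make the "only 1 or 2 options" argument concrete, here is a minimal Python sketch of how a simulator hint could be derived. The BranchRecord type and its fields are hypothetical (not an Intel PT or gem5 data structure). The idea is only that for a mispredicted conditional branch the wrong path must start at whichever successor was not actually followed, while for an indirect branch it stays unknown. Predictor collisions, as noted in the footnote, are ignored here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BranchRecord:
    """Hypothetical per-branch trace record, invented for this sketch."""
    kind: str           # "conditional", "direct", or "indirect"
    taken: bool         # direction the branch actually resolved to
    target: int         # address actually jumped to when taken
    fallthrough: int    # address of the next sequential instruction
    mispredicted: bool  # whether the trace flagged a mispredict


def wrong_path_start(rec: BranchRecord) -> Optional[int]:
    """Guess where mis-speculated fetch began, or None if unknowable."""
    if not rec.mispredicted:
        return None
    if rec.kind == "conditional":
        # Only two successors exist, so the wrong path is the other one.
        return rec.fallthrough if rec.taken else rec.target
    # Direct unconditional: no alternative direction to mispredict.
    # Indirect: the predicted target could be anywhere, so it must be guessed.
    return None


# Made-up addresses, purely for illustration.
rec = BranchRecord("conditional", taken=True, target=0x401200,
                   fallthrough=0x401010, mispredicted=True)
print(hex(wrong_path_start(rec)))
```

A simulator front-end could then be told to fetch from the returned address until the branch resolves. How many instructions it gets through before the squash still depends entirely on the simulated pipeline.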

Upvotes: 1

Peter Cordes

Reputation: 365832

You can't do that with binary instrumentation tools like PIN, only with hardware performance counters.

PIN can only see instructions along the correct path of execution; it works by adding / modifying instructions in memory to run extra code. But this new code is still just x86 machine code that the CPU has to execute, giving the illusion of running each instruction one at a time, in program order.

Mis-speculated instructions have no architectural effect so only stuff with special access to the micro-architectural state (like performance counters) can tell you anything about them.


There are perf counters for mispredicts, like perf stat -e branch-misses to count the number of branches that were mispredicted.

The number of bad uops issued by the front-end in the shadow of a mis-speculation, which then have to be cancelled, can be derived (on Skylake and probably other Intel CPUs) from uops_issued.any - uops_retired.retire_slots. Both count fused-domain uops and match each other ~exactly when there's no mis-speculation of any kind (branches, memory-order mis-speculation pipeline nukes, or whatever else).
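
As a rough sketch of that subtraction, the Python helper below assumes the two counts were collected with something like perf stat -x, -e uops_issued.any,uops_retired.retire_slots, whose CSV output puts the count in the first field and the event name in the third. The function name and the sample numbers are made up for illustration. The event names are the Skylake ones mentioned above and may differ on other CPUs.

```python
import csv
import io

ISSUED = "uops_issued.any"
RETIRED = "uops_retired.retire_slots"


def wasted_uops(perf_csv_text):
    """Return (wasted, fraction_of_issued) from `perf stat -x,` text."""
    counts = {}
    for row in csv.reader(io.StringIO(perf_csv_text)):
        # Skip blank lines, comment lines, and rows without an event field.
        if len(row) < 3 or row[0].startswith("#"):
            continue
        value, event = row[0].strip(), row[2].strip()
        # perf prints "<not counted>" or "<not supported>" instead of a number.
        if not value.isdigit():
            continue
        counts[event] = int(value)
    issued = counts[ISSUED]
    retired = counts[RETIRED]
    wasted = issued - retired
    return wasted, (wasted / issued if issued else 0.0)


# Made-up sample numbers, not a real measurement.
sample = (
    "1200000,,uops_issued.any,1000000,100.00,,\n"
    "1000000,,uops_retired.retire_slots,1000000,100.00,,\n"
)
wasted, fraction = wasted_uops(sample)
print(wasted, round(fraction, 3))
```

This only gives an aggregate count of cancelled uops. It does not identify which instructions they were, which is the part PIN cannot see.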

Upvotes: 3
