I want to figure out the performance difference between two Java applications.
Both implement tree-search algorithms, but the faster one performs more pointer dereferences while having a smaller memory footprint (according to static analysis).
My guess is that the better performance comes from better locality, i.e. a higher CPU cache hit rate. To verify this conjecture, I tried to profile CPU cache performance with Intel VTune, but found that it is unsupported in the latest Intel VTune Profiler, v2024.3.0:
Memory Access analysis is not supported inside a virtual machine since uncore events cannot be collected. For full functionality, consider using a bare-metal environment.
So, my question is: how can I profile the CPU cache performance for a Java app?
Or, is it silly to care about such low-level metrics for Java?
I would really appreciate any suggestions or advice.
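To illustrate the kind of locality effect I suspect, here is a minimal sketch. It is not code from either application: the class name `LocalityDemo`, the array size, and the seed are illustrative assumptions. It sums the same array once in sequential order and once in shuffled order, so the work is identical and only the memory access pattern differs. Timings from a naive loop like this (no JMH, no controlled warm-up) are only indicative.

```java
import java.util.Random;

public class LocalityDemo {
    // Returns the indices 0..n-1, optionally shuffled with Fisher-Yates.
    public static int[] buildOrder(int n, boolean shuffle, long seed) {
        int[] order = new int[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        if (shuffle) {
            Random rnd = new Random(seed);
            for (int i = n - 1; i > 0; i--) {
                int j = rnd.nextInt(i + 1);
                int tmp = order[i];
                order[i] = order[j];
                order[j] = tmp;
            }
        }
        return order;
    }

    // Sums data[], visiting its elements in the sequence given by order[].
    public static long sumInOrder(int[] data, int[] order) {
        long sum = 0;
        for (int idx : order) {
            sum += data[idx];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumed size: 8M ints (32 MiB per array); adjust so that the
        // data exceeds the last-level cache of the machine under test.
        int n = 1 << 23;
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = i & 0xFF;
        }
        int[] sequential = buildOrder(n, false, 42L);
        int[] shuffled = buildOrder(n, true, 42L);

        // Repeat a few rounds so later rounds run JIT-compiled code.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            long s1 = sumInOrder(data, sequential);
            long t1 = System.nanoTime();
            long s2 = sumInOrder(data, shuffled);
            long t2 = System.nanoTime();
            System.out.printf("round %d: sequential %d ms, shuffled %d ms (sums %d / %d)%n",
                    round, (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, s1, s2);
        }
    }
}
```

Both walks produce the same sum, so any timing gap between them would come from the access pattern rather than from the amount of work. What I am missing is a way to confirm that with actual cache-miss counters for the real applications.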