kizzx2
kizzx2

Reputation: 19213

How to locate idle time (and network IO time, etc.) in XPerf?

Let's say I have a contrived program:

#include <Windows.h>

void useless_function()
{
    Sleep(5000);
}

void useful_function()
{
    // ... do some work
    useless_function();
    // ... do some more work
}

int main()
{
    useful_function();
    return 0;
}

Objective: I want the profiler to tell me useful_function() is needlessly calling useless_function() which waits for no obvious reasons. Under XPerf, this doesn't show up in any of the graphs I have because the call to WaitForMultipleObjects() seem to be accounted to Idle.exe instead of my own program.

And here's the xperf command line that I currently run:

xperf -on Latency -stackwalk Profile

Any ideas?

(This is not restricted to wait functions. The above might have been solved by placing breakpoints at NtWaitForMultipleObjects. Ideally there could be a way to see the stack sample that's taking up a lot of wall-clock time as opposed to only CPU time)

Upvotes: 3

Views: 964

Answers (3)

Bruce Dawson
Bruce Dawson

Reputation: 3482

Wait Analysis is the way to do this. You should:

  • Record the CSWITCH provider, in order to get all context switches
  • Record call stacks on context switches by adding +CSWITCH to your -stackwalk argument
  • Probably record call stacks on the ready thread to get more information on who readied you (i.e.; who released the Mutex or CS or semaphore and where) by adding +READYTHREAD to your -stackwalk

Then you use CPU Usage (Precise) in WPA (or xperfview, but that's ancient) to look at the context switches and find where your TimeSinceLast is high on a thread that shouldn't be going idle. You'll typically want the columns in CPU Usage (Precise) in this sort of order:

  • NewProcess (your process being switched in)
  • NewThreadId
  • NewThreadStack
  • ReadyingProcess (who made your thread ready to run)
  • ReadyingThreadId (optional)
  • ReadyThreadStack (optional, requires +ReadyThread on -stackwalk)
  • Orange bar
  • Count
  • TimeSinceLast (us) - sort by this column, usually
  • Whatever other columns you want

For details see these particular articles from my blog: - https://randomascii.wordpress.com/2014/08/19/etw-training-videos-available-now/ - https://randomascii.wordpress.com/2012/06/19/wpaxperf-trace-analysis-reimagined/

Upvotes: 1

Thomas Kejser
Thomas Kejser

Reputation: 1294

I think what you are looking for is the Wait analysis with Ready Thread functionality in Xperf. It captures every context switch and gives you the call stack of the thread once it wakes up from sleep (or an otherwise blocked operation). In your case, you would see the stack just after the call sleep(5000) as well as the time spend sleeping.

The functionality is a bit obscure to use. But it is fortunately well described here:

Use Xperf's Wait Analysis for Application-Performance Troubleshooting

Upvotes: 2

Mike Dunlavey
Mike Dunlavey

Reputation: 40679

This "profiler" will tell you - just randomly pause it a few times and look at the stack. If do some work takes 5 seconds, and do some more work takes 5 seconds, then 33% of the time the stack will look like this

main: calling useful_function
useful_function: calling useless_function
useless_function: calling Sleep

So roughly 33% of your stack samples will show exactly that. Any line of code that's costing some fraction of wall-clock time will appear on roughly that fraction of samples.

On the rest of the samples you will see it doing the other things.

There are automated profilers that do the same thing in a more pretty way, such as Zoom and LTProf, although they don't actually show you the samples.

I looked at the xperf doc, trying to figure out if you could get stack samples on wall-clock time and get percents at line-level resolution. It seems you gotta be on Windows 7 or Vista. They only bother with functions, not lines, which if you have realistically big functions, is important. I couldn't figure out how to get access to the individual samples, which I think is important for seeing why the program is spending its time.

Upvotes: 0

Related Questions