Reputation: 3334
We have a 16 processor SQL Server 2005 cluster. When looking at CPU usage data we see that most of the time only 4 of the 16 processors are ever utilized. However, in periods of high load, occasionally a 5th and 6th processor will be used, although never anywhere near the utilization of the other 4. I'm concerned that in periods of tremendously high load that not all of the other processors will be utilized and we will have performance degradation.
Is what we're seeing standard SQL Server 2005 cluster behavior? I assumed that all 16 processors would be utilized at all times, though this does not appear to be the case. Is this something we can tune? Or is this expected behavior? Will SQL server be able to utilize all 16 processors if it comes to that?
Upvotes: 3
Views: 4576
Reputation: 294307
I'll consider you did due diligence and validated that the CPU consumption belongs to the sqlservr.exe process, so we're not chasing a red herring here. If not, please make sure the CPU is consumed by sqlservr.exe by checking the Process\% Processor performance counters.
You need to understand the SQL Server CPU scheduling model, as described in Thread and Task Architecture. SQL Server spreads requests (sys.dm_exec_requests) across schedulers (sys.dm_os_schedulers) by assigning each requests to task (sys.dm_os_tasks) that is run by a worker (sys.dm_os_workers). A worker is backed by an OS thread or fiber (sys.dm_os_threads). Most requests (a batch sent to SQL Server) spawn only one task, some requests though may spawn multiple tasks (parallel queries being the most notorious).
The normal behavior of SQL Server 2005 scheduling should be to distribute the tasks evenly, across all schedulers. Each scheduler corresponds to one CPU core. The result should be an even load on all CPU cores. But I've seen the problem you describe a few times in the labs, when the physical workload would distribute unevenly across only few CPUs. You have to understand that SQL Server does not control the thread affinity of its workers, but instead relies on the OS affinity algorithm for thread locality. What that means is that even if SQL Server spreads the requests across the 16 schedulers, the OS might decide to run the threads on only 4 cores. In correlation with this issue there are two problems that may cause or aggravate this behavior:
Also make sure your SQL 2005 is at least at SP2 level, prefferably at latest SP and all CU applied. Same goes for Windows (do you run Windows 2003 or Windows 2008?).
In theory the behavior could also be explained by a very peculiar workload, ie. SQL sees only few very long and CPU demanding requests that have no parallle option. But that would be an extremly skewed load and I never seen something like that in real life.
Upvotes: 3
Reputation: 27294
Even accounting for IO bottleneck I would check is whether you have processor affinities set up, what your maxdop setting is, whether it is SMP or NUMA which should also affect what maxdop you may wish to set.
when you say you have a 16 processor cluster, you mean 2 SQL servers in a cluster with 16 processors each, or 2 x 8 way SQL servers?
Upvotes: 2
Reputation: 415860
Hard to be sure without hard data, but I suspect the problem is that you're more IO-bound or memory-bound than CPU-bound right now, and 4 processors is enough to keep up with your real bottleneck.
My reasoning is that if there were some configuration problem that was keeping you limited to 4 cpus, you wouldn't see it spill over to the 5th and 6th processors at all.
Upvotes: 1
Reputation: 120480
Are you sure that you're not bottlenecking elsewhere? On IO perhaps?
Upvotes: 1