SQL Server threads - one does all the work, the other MAXDOP - 1 (5..30) do nothing

Question

We have a table with more than 100.000.000 records, it is a slowly changing dimension where for the same durable key, fields ValidFrom (expressed as TaskDate in the screen-shot bellow) and ValidUntil decide on the record's validity at a certain date.

After trying to fix the issue with filtered indices, filtered statistics, hardcoded parameters, dynamic SQL... no progress made on one specific issue: the optimiser selects the right index, does a seek, estimated records are relatively close to the actual records, but only one CPU thread reads something.

Tricky Read - Skewed Workload

It doesn't matter if I run it with MAXDOP 6, or letting it run on all available CPUs.

What else could I try?

First edit:

Some comments were focusing on the data size involved, while this is not the problem here. Attached is the upscoped query which does more than just a record count (used initially to demonstrate a lazy worker thread situation).

Direct consequence of this lazy worker thread on read is the key lookup that does exactly the same - only one thread is receiving records and has something to do.

Execution plan can be seen here: http://pastebin.com/zJwZ56vh

30 lazy threads while one is working

Martin Smith · Accepted Answer

This is explained here. Parallel Nested Loops Join.

You only have one row feeding into the nested loops join. Parallelism with nested loops joins usually works by distributing the outer rows between multiple threads, not by having multiple threads process an individual outer row.

As there is only one outer row to be processed this means a single thread gets assigned it and ends up doing all the work in that parallel zone (until the repartition streams operator)

Exception

A single outer row and no correlated parameters may in some rare cases yield parallelism on the inner side of a nested loops join

(Source p. 21)

In your case you are not cross joining and the value from Table3 is correlated and shows up in the "outer references" for the nested loops operator so that exception does not apply.

So you need to move the part you want parallelised such that it is no longer on the inside of the nested loops. If there will always be one row coming from Table3 you could just assign the value to a scalar variable in a separate query then use the scalar variable in the main query.

Otherwise you could look at expressing the query in a different way or using query hints to achieve this.

Screenshot of parallel part of plan for reference

Plan

SQL Server threads - one does all the work, the other MAXDOP - 1 (5..30) do nothing

Answers (1)

Related Questions