Reputation: 1598
We have a table with more than 100.000.000 records, it is a slowly changing dimension where for the same durable key, fields ValidFrom (expressed as TaskDate in the screen-shot bellow) and ValidUntil decide on the record's validity at a certain date.
After trying to fix the issue with filtered indices, filtered statistics, hardcoded parameters, dynamic SQL... no progress made on one specific issue: the optimiser selects the right index, does a seek, estimated records are relatively close to the actual records, but only one CPU thread reads something.
It doesn't matter if I run it with MAXDOP 6, or letting it run on all available CPUs.
What else could I try?
First edit:
Some comments were focusing on the data size involved, while this is not the problem here. Attached is the upscoped query which does more than just a record count (used initially to demonstrate a lazy worker thread situation).
Direct consequence of this lazy worker thread on read is the key lookup that does exactly the same - only one thread is receiving records and has something to do.
Execution plan can be seen here: http://pastebin.com/zJwZ56vh
Upvotes: 3
Views: 1321
Reputation: 453298
This is explained here. Parallel Nested Loops Join.
You only have one row feeding into the nested loops join. Parallelism with nested loops joins usually works by distributing the outer rows between multiple threads, not by having multiple threads process an individual outer row.
As there is only one outer row to be processed this means a single thread gets assigned it and ends up doing all the work in that parallel zone (until the repartition streams operator)
Exception
A single outer row and no correlated parameters may in some rare cases yield parallelism on the inner side of a nested loops join
(Source p. 21)
In your case you are not cross joining and the value from Table3
is correlated and shows up in the "outer references" for the nested loops operator so that exception does not apply.
So you need to move the part you want parallelised such that it is no longer on the inside of the nested loops. If there will always be one row coming from Table3
you could just assign the value to a scalar variable in a separate query then use the scalar variable in the main query.
Otherwise you could look at expressing the query in a different way or using query hints to achieve this.
Screenshot of parallel part of plan for reference
Upvotes: 3