Reputation: 570
I have a table with more than a billion rows, with one of the column definitions being:
row_id int NOT NULL
And an index on it
CREATE NONCLUSTERED INDEX row_id_IX ON dbo.test_table ( row_id ASC )
I want to find the minimum value of row_id
in the table. So I write
select min(row_id) from test_table
Given this index is a B+ tree in SQL Server, and the values are NOT-nullable, I expect the query plan to recursively seek towards left child of the root node of B+ tree, until it finds the minimum value. But instead, the planner decides to do an Index scan and ends up reader a billion values followed by an aggregation to find the min value.
Why is that?
row_id_IX
is a partitioned index.
Upvotes: 1
Views: 117
Reputation: 453736
A solution to this is to get the MIN
per partition (which should just require reading the single row at the end of the index from each) and then get the MIN
of those.
One query approach for this (covered by Itzik Ben Gan here) is
SELECT MIN(m.row_id) AS row_id
FROM sys.partitions AS P
CROSS APPLY (SELECT MIN(row_id) AS row_id
FROM dbo.test_table
WHERE $PARTITION.YourPartitionFunction(YourPartitionColumn) = P.partition_number) AS m
WHERE P.object_id = OBJECT_ID('dbo.test_table')
AND P.index_id <= 1;
You haven't told us the name of your partition function or partitioning column so replace YourPartitionColumn
and YourPartitionFunction
as appropriate above (and remember to replace both references to test_table
when applying it against a different table name).
The execution plan for this should have the partition numbers returned from sysrowsets
on the outside of a nested loops and on the inside of this an ordered index scan with a seek predicate to seek into just the single partition of interest. This should be under a TOP 1
operator to stop reading from the scan after the first row is read.
If you don't see the above plan (e.g. as the column you are aggregating is nullable) a more verbose alternative is
SELECT MIN(m.row_id) AS row_id
FROM sys.partitions AS P
CROSS APPLY (SELECT TOP 1 row_id
FROM dbo.test_table
WHERE $PARTITION.YourPartitionFunction(YourPartitionColumn) = P.partition_number
AND row_id IS NOT NULL
ORDER BY row_id
) AS m
WHERE P.object_id = OBJECT_ID('dbo.test_table')
AND P.index_id <= 1;
The more verbose alternative does mean that you will potentially miss out on the message:
Warning: Null value is eliminated by an aggregate or other SET operation.
but I doubt anyone cares about seeing that.
Upvotes: 6