Reputation: 35557
I have a table tb_FirstName
with one field FirstName
. The table has 100 million non null records with lots of repetitions e.g. John occurs 2 million times. The distinct count of FirstName
is over 2 million.
How do I select 1000 distinct names as quickly as possible using standard sql?
I'm currently using the following but this is
Maybe not as efficient as it could be.
SELECT x.FirstName
FROM (
SELECT FirstName,
rnk = RANK() OVER (ORDER BY Firstname)
FROM WHData.dbo.tb_DimUserAccount A
GROUP BY FirstName
) x
WHERE rnk <=1000
Upvotes: 7
Views: 18696
Reputation: 16904
Option with GROUP BY clause
SELECT TOP 1000 FirstName
FROM WHData.dbo.tb_DimUserAccount
GROUP BY FirstName
ORDER BY FirstName
Upvotes: 1
Reputation: 411
Try this
SELECT TOP 1000 FirstName FROM
(SELECT
ROW_NUMBER() OVER(PARTITION BY FirstName ORDER BY FirstName) NO,
FirstName FROM WHData.dbo.tb_DimUserAccount )
AS T1 WHERE no =1
or
SELECT DISINCT TOP 1000 FirstName
FROM WHData.dbo.tb_DimUserAccount ORDER BY FirstName
Upvotes: 3
Reputation: 20745
You need the data after sorting the results on FirstName fields.
It requires full table scan if Index is not created. If Index is created on FirstName then Unique Index scan can improve the time.
Upvotes: 2
Reputation: 62831
Seems like you could use TOP 1000
with DISTINCT
:
SELECT DISINCT TOP 1000 FirstName
FROM WHData.dbo.tb_DimUserAccount
ORDER BY FirstName
Upvotes: 12
Reputation: 399
Make sure you have an index defined on FirstName.
SELECT TOP 1000 FirstName
FROM (SELECT DISTINCT FirstName
FROM dbo.tb_DimUserAccount) N
ORDER BY FirstName
Upvotes: 2