Reputation: 629
The query below to get the distinct zip codes from the Address
table takes roughly 4 mins and 42 seconds. There are 1,006,699 records in the Address
table. The composite key for the table is Address1, Address2, City, ZipCode
.
There have been times when the query takes 5 seconds to run or even 1 mill seconds.
How do I improve the performance of the query?
Here is the SQL query:
SELECT DISTINCT ZipCode FROM Address
Here is the schema for the table:
CREATE TABLE [dbo].[Address]
(
[AddressID] [INT] IDENTITY(1,1) NOT NULL,
[Address1] [NVARCHAR](1000) NOT NULL,
[Address2] [NVARCHAR](1000) NOT NULL,
[City] [NVARCHAR](1000) NOT NULL,
[StateCd] [NVARCHAR](2) NULL,
[ZipCode] [NVARCHAR](10) NOT NULL,
PRIMARY KEY CLUSTERED
([Address1] ASC, [Address2] ASC, [City] ASC, [ZipCode] ASC)
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Address] ADD DEFAULT ('') FOR [Address2]
GO
I can't seem to add an image of the execution plan.
Upvotes: 0
Views: 121
Reputation: 996
1.If possible convert ZipCode
datatype from nvarchar
to bigint
.
2.try group by ZipCode
SELECT ZipCode FROM Address GROUP BY ZipCode;
Upvotes: 0
Reputation: 46203
For maximum performance, you could create an indexed view so that the aggregation is materialized:
CREATE VIEW vw_Address_ZipCode
WITH SCHEMABINDING
AS
SELECT ZipCode, COUNT_BIG(*) AS ZipCodeCount
FROM dbo.Address
GROUP BY ZipCode;
GO
CREATE UNIQUE CLUSTERED INDEX cdx ON dbo.vw_Address_ZipCode(ZipCode);
GO
If you are using Enterprise Edition, the optimizer can consider the indexed view without referencing the view directly:
SELECT DISTINCT ZipCode FROM Address;
In lesser editions, you'll need to query the view and add a NOEXPAND
query hint so the index is considered for optimization:
SELECT DISTINCT ZipCode FROM dbo.vw_Address_ZipCode WITH(NOEXPAND);
See the documentation for indexed view requirements.
Upvotes: 0
Reputation: 1269823
For this query:
SELECT DISTINCT ZipCode FROM Address
You want an index on ZipCode
or at least where ZipCode
is the first column:
create index idx_address_zipcode on address(zipcode);
The resulting execution plan should be a scan of the index, which is much faster than processing the original table (and aggregating to get the distinct zip codes).
You could also change your existing index to (zipcode, city, address1, address2)
. This makes the index more useful (in my opinion), because zipcode
is more likely to be used for filtering than address1
. However, that index will be larger than one only on zipcode
.
Upvotes: 2