Daniel Sharp
Daniel Sharp

Reputation: 191

Adding WHERE adds 20+ seconds to SQL Azure query

I'm looking for some advice on speeding up queries in SQL Azure. This is an example of the two queries we're running, when we add a WHERE clause on there, the queries grind to a halt.

Both columns, theTime and orderType are indexed. Can anyone suggest how to make these run faster, or things to do to the query to make it more efficient?

5.2 seconds:

sum(cast(theTime AS INT)) as totalTime from Orders

20.2 seconds:

 sum(cast(theTime AS INT)) as totalTime from Orders WHERE orderType='something_in_here'

Here's the relevant information:

CREATE TABLE [dbo].[Orders] (
[ID] int IDENTITY(1,1) NOT NULL,
[orderType] nvarchar(90) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[orderTime] nvarchar(90) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

CONSTRAINT [PrimaryKey_fe2bdbea-c65a-0b85-1de9-87324cc29bff] PRIMARY KEY CLUSTERED ([ID]) 
WITH (IGNORE_DUP_KEY = OFF)
)
GO
CREATE NONCLUSTERED INDEX [orderTime]
ON [dbo].[Orders] ([orderTime] ASC)
WITH (IGNORE_DUP_KEY = OFF,
STATISTICS_NORECOMPUTE = OFF,
ONLINE = OFF)
GO
CREATE NONCLUSTERED INDEX [actiontime_int]


CREATE NONCLUSTERED INDEX [orderType]
ON [dbo].[Orders] ([orderType] ASC)
WITH (IGNORE_DUP_KEY = OFF,
STATISTICS_NORECOMPUTE = OFF,
ONLINE = OFF)
GO

Upvotes: 1

Views: 228

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269503

I suspect your query is not doing what you think. It is taking the first million counts, rather than the count of the first million rows. I think you want:

select sum(cast(theTime AS INT))
from (select top (1000000) Orders
      from Orders
     ) t

versus:

select sum(cast(theTime AS INT))
from (select top  (1000000) Orders
      from Orders
      WHERE orderType='something_in_here'
     ) t

My suspicion is that using the index actually slows things down, depending on the selectivity of the where clause.

In the original query, you are reading all the data, sequentially. This is fast, because the pages just cycle through the processor.

Going through the index slows things down, because the pages are not read in order. You may be still be reading all the pages (if every page has a matching row), but they are no longer being read in "physical" or "logical" order. They are being read in the order of the index -- which is likely to be random.

Upvotes: 1

Related Questions