Real Master

Reputation: 347

Improve query performance on a SQL Server table containing 3.5 million rows and growing

I have written an application in C# that is connected to a SQL Server Express database. From the front end I populate a particular table every few seconds, inserting approximately 200~300 rows each time.

Currently the table contains approximately 3.5 million rows and keeps growing. The table definition is as below:

[DEVICE_ID] [decimal](19, 5) NULL,
[METER_ID] [decimal](19, 5)  NULL,
[DATE_TIME] [decimal](19, 5)  NULL,
[COL1] [decimal](19, 5)  NULL,
[COL2] [decimal](19, 5)  NULL,
.
.
.
.
[COL25] [decimal](19, 5) NULL

I have created a non-clustered index on the Date_Time column. Note that there is no unique column; if required, I can add an identity (auto-increment) column, but my report generation logic is based entirely on the Date_Time column.
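The index was created roughly like the following (the index name here is only illustrative):

CREATE NONCLUSTERED INDEX IX_VALUEDATA_DATE_TIME   -- illustrative name
    ON VALUEDATA (Date_Time);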

I usually fire the query based on time. For example, if I need to calculate the variation in COL1 over a month, I need the value of COL1 from the first record of the first day and from the last record of the last day of that month. Likewise, I need to fire the query for flexible date ranges, and I usually need only the opening and closing value, based on the Date_Time column, for any chosen column.

To get the first value of COL1 for the first day, the query is:

SELECT TOP (1) COL1
FROM VALUEDATA
WHERE DeviceId = @DId
  AND MeterId = @MId
  AND Date_Time BETWEEN @StartDateTime AND @EndDateTime
ORDER BY Date_Time;

To get the last value of COL1 for the last day, the query is:

SELECT TOP (1) COL1
FROM VALUEDATA
WHERE DeviceId = @DId
  AND MeterId = @MId
  AND Date_Time BETWEEN @StartDateTime AND @EndDateTime
ORDER BY Date_Time DESC;

But when I fire the above queries, each takes approximately 20~30 seconds. I believe this can be optimized further, but I don't know the way ahead.

One thought I have given to this is to create another table, insert the first and last row of every day into it, and fetch the data from there. But I would prefer to avoid that if I can do something with the existing table and query.
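If I did go down that route, the daily table would look roughly like the sketch below (the table and column names are only illustrative, and only COL1 is shown):

-- Illustrative only: one row per device/meter/day holding the opening and
-- closing value of COL1; a daily job would populate it from VALUEDATA.
CREATE TABLE VALUEDATA_DAILY (
    [DAY_START]  [decimal](19, 5) NOT NULL,  -- Date_Time value at the start of the day
    [DEVICE_ID]  [decimal](19, 5) NOT NULL,
    [METER_ID]   [decimal](19, 5) NOT NULL,
    [OPEN_COL1]  [decimal](19, 5) NULL,
    [CLOSE_COL1] [decimal](19, 5) NULL,
    CONSTRAINT PK_VALUEDATA_DAILY
        PRIMARY KEY ([DAY_START], [DEVICE_ID], [METER_ID])
);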

Any inputs on this would be greatly appreciated.

Upvotes: 1

Views: 496

Answers (2)

Rogier Werschkull

Reputation: 744

I have another suggestion: if your goal is to get the values of COL1, COL2, etc. after the index lookup, a solution with just a nonclustered index on the filtering columns still has to join back to the base table, i.e. do a bookmark / RID lookup.

Your info gives me the impression that your base table is not clustered (has no clustered index); it is in fact a heap table.

If most of your querys on the table follow the pattern you describe, I would make this table clustered. In contrary what most people think, you do not have to define an clustered index as the (unique) primary key. If you define a clustered index in SQL server on non unique data, SQL server will make it unique 'under water' by adding an invisible row identifier...

If the main, most often used selection / filter criterion on this table is Date_Time, I would change the table to the following clustered structure:

  1. First, remove all nonclustered indexes (a sketch of this step follows the index definition below)
  2. Then add the following clustered index:

CREATE CLUSTERED INDEX clix_valuedata ON VALUEDATA (Date_Time, DeviceId, MeterId);
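For step 1, the statement would be something like the following; IX_VALUEDATA_DATE_TIME is only a placeholder, as the question does not give the actual name of the existing nonclustered index:

DROP INDEX IX_VALUEDATA_DATE_TIME ON VALUEDATA;   -- substitute the real index name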

With queries that follow your pattern you will (probably!) get very performant clustered index SEEK access to the table, which you can verify in the query execution plan. You now also get all the other columns in the table for free, as bookmark lookups are no longer needed. This approach will probably scale better as the table grows, too, because of the SEEK behaviour.
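A quick way to check this is to run one of the original queries with I/O and timing statistics switched on and compare the number of reads before and after the change; the parameter values below are only sample placeholders:

-- Sample parameter values, for illustration only; substitute real ones.
DECLARE @DId           decimal(19, 5) = 1,
        @MId           decimal(19, 5) = 1,
        @StartDateTime decimal(19, 5) = 0,
        @EndDateTime   decimal(19, 5) = 99999999999999.99999;

SET STATISTICS IO ON;    -- reports logical reads per table
SET STATISTICS TIME ON;  -- reports compile and execution times

SELECT TOP (1) COL1
FROM VALUEDATA
WHERE DeviceId = @DId
  AND MeterId = @MId
  AND Date_Time BETWEEN @StartDateTime AND @EndDateTime
ORDER BY Date_Time;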

Upvotes: 1

Marc Guillot

Reputation: 6455

To fully optimize those queries you need two different multi-column indexes:

CREATE INDEX ix_valuedata_asc ON VALUEDATA (DeviceId, MeterId, Date_Time);
CREATE INDEX ix_valuedata_des ON VALUEDATA (DeviceId, MeterId, Date_Time DESC);
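If the queries only ever read COL1 (or a small, known set of columns), a covering variant of these indexes with INCLUDE would also avoid the lookup back to the base table that the other answer mentions; this is only a sketch:

-- Covering variants (sketch): COL1 is stored at the leaf level, so the
-- TOP (1) queries need no lookup into the base table.
CREATE INDEX ix_valuedata_asc_cov ON VALUEDATA (DeviceId, MeterId, Date_Time) INCLUDE (COL1);
CREATE INDEX ix_valuedata_des_cov ON VALUEDATA (DeviceId, MeterId, Date_Time DESC) INCLUDE (COL1);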

Upvotes: 1
