Reputation: 1667
I have a table which has several hundred million rows of data. There is a field called EventId
which is an integer field.
I have a number of different views which will return only data with a specific EventId
If I run a query
SELECT TOP 1000 * FROM vw_MyView
It takes 5 mins to return the rows. What do I need to add indexes to? There are currently indexes on the master table for the primary key (clustered on LogId
) and a non-clustered on EventId
as that is being used in the where clause of each view.
I know views can be indexed - what field(s) should I be indexing on in the view? Should I just run the DB Engine Tuning Wizard and see what it says?
UPDATE following feedback
My master table with all the data inside is along the lines of the following schema
LogId (int) PK
EventId (int)
Param1 varchar(255)
Param2 varchar(255)
..
..
..
Param24 varchar(255)
Each event type has different parameters hence the generic field names in the master table.
I have a view for each type of event, where the ParamX fields in the master table are given a proper field name through the view.
So the view for one event would be something like
SELECT LogId, Param1 AS Name, Param2 AS Address1, Param3 AS Address2
WHERE EventId = 10
I tried the query
SELECT TOP 1000 LogId from vw_MyView
and that worked fast. It's the other fields which are slowing it down, which I assume is down to poor indexing?
UPDATE 2 - More info
Previously the data for each event was stored in a table for each event. This meant that adding new events would require a new table for each one.
I am bulk importing data into a temp table then moving that over into my master table. The bulk import makes it fast, but I am concerned that having such a large master table could mean that querying it becomes so slow that it is unusable.
The millions of rows of data is over 10 years or so and so I could probably move the first 8 years of data over to another database for archive purposes so that just the latest 2 years are kept.
So the question is do I continue with the master table approach which would require no maintenance but potentially a lot of indexing, or go back to the original approach of having a table per event?
Thanks for the feedback guys, really appreciate it
Upvotes: 4
Views: 5232
Reputation: 36126
You shouldn't create indexed views on tables that have "several hundred million rows of data".
A regular view is just a SQL statement stored on your database , so if you run select * from your_view
, sql server just change your_view
for its select.
If you want to index a view, you have to persist the data on the database, which means to make a physical copy of the indexed columns on the DB. As you can imagine, it will drastically increase storage size if you have that amount of rows, specially if you are joining that big table with other tables
Upvotes: 0
Reputation: 280262
Typically views contain more columns than necessary to satisfy many different queries, where those queries would probably benefit better from targeting only the relevant tables and columns directly. (I blogged about this in 2010.) In your case I don't think indexing the view is going to help you... the queries you're writing would just scan that index instead of the clustered index on the base table (and since you aren't doing aggregations, it's going to be the same number of rows, but probably fewer pages). I think the better solution is optimizing the indexes on the underlying table.
An index I might suggest that should dramatically help the query you showed in your view for event 10:
CREATE NONCLUSTERED INDEX Index_ForViewOnEvent10
ON dbo.MasterTable(LogID)
INCLUDE(Param1, Param2, Param3)
WHERE EventID = 10;
Since this index covers the query, and since the filter dramatically reduces the number of rows that have to be scanned, this should be a noticeable improvement.
You may experiment with this with / without the INCLUDE portion. It may be efficient enough to pull those from the clustered index. But I think the above would be the best option.
Now, don't go ballistic here - as I explain below, you don't always want to throw 100 indexes on a table to perfectly satisfy 100 different queries - there is balance to be found between satisfying the queries and not making turning your writes into molasses.
I'll share a few words on the Database Engine Tuning Advisor (DTA).
This wizard is not magic. There is no harm in running it and seeing what it says, but you should not just take all of its recommendations and implement them. I suggest a couple of things when using this tool:
DBSophic has much better tools than the index tuning wizard - they take entire workloads into consideration and don't offer redundant indexes. Qure Analyzer is free and offers a subset of the features available in Qure Optimizer.
Upvotes: 3
Reputation: 171371
I think you are confusing Indexed Views with Views. Indexed Views can (must) have indexes of course, but that does not mean you must use an indexed view in order for SQL Server to use indexes. It will use any indexes that exist on the underlying tables in regular views if it deems that using the index will be faster.
So, start by examining the SQL for your view, and determine what indexes may help that query perform faster. If you post the SQL for the view, we may able to suggest some ideas.
Also, your select statement does not contain an ORDER BY clause, so it is a non-deterministic query. In other words, you may get different results on subsequent runs, even if the data has not changed.
With regards to the EventId
column, you can try adding an index to it, but it may not help. It depends on the cardinality of the data in that column.
Upvotes: 2