Leonardo
Leonardo

Reputation: 11391

SQL Server - what kind of index should I create?

I need to make queries such as

SELECT 
    Url, COUNT(*) AS requests, AVG(TS) AS avg_timeSpent 
FROM 
    myTable 
WHERE 
    Url LIKE '%/myController/%'
GROUP BY 
    Url

run as fast as possible.

The columns selected and grouped are almost always the same, being the difference, an extra column on the select and group by (the column tenantId)

What kind of index should I create to help me run this scenario?

Edit 1:
If I change my base query to '/myController/%' (note there's no % at the begging) would it be better?

Upvotes: 0

Views: 67

Answers (3)

Remus Rusanu
Remus Rusanu

Reputation: 294287

Columnstore indexes can be quite fast at such tasks (aggregates on globals scans). But even they will have trouble handling a LIKE '%/mycontroler/%' predicate. I recommend you parse the URL once into an additional computed field that projects the extracted controller of your URL. But the truth is that looking at global time spent on a response URL reveals very little information. It will contain data since the beginning of time, long since obsolete by newer deployments, and not be able to capture recent trends. A filter based on time, say per hour or per day, now that is a very useful analysis. And such a filter can be excellently served by a columnstore, because of natural time order and segment elimination.

Upvotes: 1

Thorsten Kettner
Thorsten Kettner

Reputation: 94914

This is a query that cannot be sped up with an index. The DBMS cannot know beforehand how many records will match the condition. It may be 100% or 0.001%. There is no clue for the DBMS to guess this. And access via an index only makes sense when a small percentage of rows gets selected.

Moreover, how can such an index be structured and useful? Think of a telephone book and you want to find all names that contain 'a' or 'rs' or 'ems' or whatever. How would you order the names in the book to find all these and all other thinkable letter combinations quickly? It simply cannot be done.

So the DBMS will read the whole table record for record, no matter whether you provide an index or not.

There may be one exception: With an index on URL and TS, you'd have both columns in the index. So the DBMS might decide to read the whole index rather than the whole table then. This may make sense for instance when the table has hundreds of columns or when the table is very fragmented or whatever. I don't know. A table is usually much easier to read sequentially than an index. You can still just try, of course. It doesn't really hurt to create an index. Either the DBMS uses it or not for a query.

Upvotes: 1

Rahul
Rahul

Reputation: 77876

Based on your posted query you should have a index on Url column. In general columns which are involved in WHERE , HAVING, ORDER BY and JOIN ON condition should be indexed.

You should get the generated query plan for the said query and see where it's taking more time. Again based n the datatype of the Url column you may consider having a FULLTEXT index on that column

Upvotes: 0

Related Questions