Reputation: 93417
When I joined my current employer, a new database schema was designed that will be the basis of a lot of the future tools that are/will be created. With my limited SQL knowledge I think the schema is rather well designed. My only concern is that almost every table has a multi-part primary key: every table has at least a CustomerId plus a key of its own. While these columns do indeed define a record, I have the feeling that such wide keys (we're talking quadruple here) are very inefficient.
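For illustration only, a table in that kind of schema might look roughly like this (table and column names are invented; only the CustomerId column and the wide clustered primary key pattern are taken from the description above):
-- Hypothetical example of the pattern: every table carries CustomerId
-- plus its own keys, giving a wide (here quadruple) clustered primary key.
CREATE TABLE OrderLine (
    CustomerId int NOT NULL,
    OrderId    int NOT NULL,
    ProductId  int NOT NULL,
    LineId     int NOT NULL,
    Quantity   int NOT NULL,
    CONSTRAINT PK_OrderLine PRIMARY KEY CLUSTERED
        (CustomerId, OrderId, ProductId, LineId)
);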
Today I saw some unimaginable CPU usage from a simple, repeated query that joins two tables, selects a single string field from the first and returns its distinct values.
select distinct(f.FIELDNAME) as fieldName
from foo f
inner join bar b
on f.id = b.fId
where b.cId = @id;
Checking the execution plan (I'm no EP hero) I noticed three major CPU points: the distinct (as expected) and two seeks over the indexes. I would have thought that index seeks should be extremely fast, yet they take up 18% of the cost each. Is this normal? Is it due to the (quadruple) clustered indexes?
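For reference, the percentages come from the graphical plan; to get raw CPU/IO figures one could also run something like the following (standard SQL Server session options; the @id type and value below are placeholders, the real parameter type is unknown):
-- Report per-statement CPU time and logical reads for the query below.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

DECLARE @id int = 42;  -- sample value only

select distinct(f.FIELDNAME) as fieldName
from foo f
inner join bar b
on f.id = b.fId
where b.cId = @id;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;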
--UPDATE--
The query is used for building a Lucene index. It's a one-off processing run that happens about weekly (sounds contradictory, I know). As far as I can see, I can't reuse any results here.
Upvotes: 1
Views: 170
Reputation: 18322
In most databases, an index isn't used if its first column isn't referenced in the query. You say that customerId is part of every primary key, but you don't use it in the join in your query. To answer your question properly, we really need to see the create table output for foo and bar, or at least show index from each of them.
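Concretely, that means something like the following (the commands named above are MySQL syntax; if this is actually SQL Server, sp_helpindex or the sys.indexes catalog view would be the rough equivalent):
-- MySQL: full table definitions, including the primary keys
SHOW CREATE TABLE foo;
SHOW CREATE TABLE bar;

-- ...or at least the index layout of each table
SHOW INDEX FROM foo;
SHOW INDEX FROM bar;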
That said, your query may be faster if you change it like so:
select distinct(f.FIELDNAME) as fieldName
from foo f
inner join bar b
on f.id = b.fId
and f.cId = b.cId -- using this part of the key will speed it up
where b.cId = @id;
My comment assumes that your primary key is ordered as "cId, fId". Effectively, that means your query doesn't have to check every cId, only the ones that are part of the index.
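Spelled out, the assumed key shape is something like this (simplified to two columns here; the real key reportedly has four):
-- Assumed ordering of bar's primary key: cId leads, so a filter on cId
-- alone can still seek the index, while a lookup on fId alone cannot.
ALTER TABLE bar
    ADD CONSTRAINT PK_bar PRIMARY KEY (cId, fId);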
Upvotes: 0
Reputation: 425843
Could you please run the following queries and post their output:
SELECT COUNT(*), COUNT(DISTINCT fieldname)
FROM foo
SELECT COUNT(*), COUNT(DISTINCT cId), COUNT(DISTINCT fId)
FROM bar
This will help to estimate which indexes best suit your needs.
Meanwhile make sure you have the following indexes:
foo (FIELDNAME)
bar (cId, fId)
and rewrite your query:
SELECT DISTINCT(fieldname)
FROM foo f
WHERE EXISTS (
SELECT 1
FROM bar b
WHERE b.fId = f.id
AND b.cId = @id
)
This query should use the index on f.FIELDNAME to build the DISTINCT list and the index on bar to filter out the non-existent values.
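If the indexes listed above don't exist yet, the DDL would be along these lines (the index names are placeholders):
-- Supports building the DISTINCT list from foo
CREATE INDEX IX_foo_fieldname ON foo (FIELDNAME);

-- Supports the EXISTS probe on bar: filter on cId, match on fId
CREATE INDEX IX_bar_cid_fid ON bar (cId, fId);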
Upvotes: 3
Reputation: 131202
This kind of query looks familiar. I'm guessing here, but it's probably populating a combo box on a web/WinForms UI that is being hit pretty hard.
Perhaps you should be caching the results on the application side so you don't end up executing the query so often. Worst-case scenario, you could cache this on SQL Server's side, but that's a massive kludge.
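For completeness, the server-side kludge would be something like an indexed view (a minimal sketch, assuming SQL Server, the dbo schema, and the column names from the question; DISTINCT isn't allowed in indexed views, so GROUP BY stands in for it):
-- Pre-materialize the distinct field names per customer, so the repeated
-- query becomes a single seek on the view's clustered index.
CREATE VIEW dbo.vFooFieldNamesByCustomer
WITH SCHEMABINDING
AS
SELECT b.cId, f.FIELDNAME, COUNT_BIG(*) AS RowCnt
FROM dbo.foo AS f
JOIN dbo.bar AS b ON b.fId = f.id
GROUP BY b.cId, f.FIELDNAME;
GO

CREATE UNIQUE CLUSTERED INDEX IX_vFooFieldNamesByCustomer
    ON dbo.vFooFieldNamesByCustomer (cId, FIELDNAME);
GO

-- The hot query then reads straight from the view (NOEXPAND forces the
-- indexed view to be used on non-Enterprise editions):
SELECT FIELDNAME
FROM dbo.vFooFieldNamesByCustomer WITH (NOEXPAND)
WHERE cId = @id;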
Upvotes: 1