Reputation: 1192
Hi i was wondering what the performance overheads are like for single column vs multiple column indexes in terms of inserts. So for example if i have 3 single column indexes would that be better for inserts into that table as opposed to having 1 index with multiple columns. When i'm talking about performance i'm interested in raw speed.
Upvotes: 10
Views: 7506
Reputation: 52107
if i have 3 single column indexes would that be better for inserts into that table as opposed to having 1 index with multiple columns.
For operations that modify data, a single index with 3 columns should be faster than 3 indexes with single column, for following reasons:
Consider the following MS SQL Server benchmark:
CREATE TABLE ONE_INDEX (
ID int PRIMARY KEY NONCLUSTERED,
F1 uniqueidentifier NOT NULL,
F2 uniqueidentifier NOT NULL,
F3 uniqueidentifier NOT NULL
);
CREATE INDEX ONE_INDEX_IE1 ON ONE_INDEX (F1, F2, F3);
CREATE TABLE THREE_INDEXES (
ID int PRIMARY KEY NONCLUSTERED,
F1 uniqueidentifier NOT NULL,
F2 uniqueidentifier NOT NULL,
F3 uniqueidentifier NOT NULL
);
CREATE INDEX THREE_INDEXES_IE1 ON THREE_INDEXES (F1);
CREATE INDEX THREE_INDEXES_IE2 ON THREE_INDEXES (F2);
CREATE INDEX THREE_INDEXES_IE3 ON THREE_INDEXES (F3);
GO
SET NOCOUNT ON
DECLARE @t DATETIME;
DECLARE @id INT;
DECLARE @count INT;
SET @count = 100000;
PRINT 'ONE_INDEX:'
SET @t = CURRENT_TIMESTAMP
SET @id = 0;
BEGIN TRANSACTION;
WHILE @id < @count BEGIN
INSERT INTO ONE_INDEX VALUES(@id, NEWID(), NEWID(), NEWID());
SET @id = @id + 1;
END
COMMIT TRANSACTION;
PRINT ' INSERT ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';
SET @t = CURRENT_TIMESTAMP
SET @id = 0;
BEGIN TRANSACTION;
WHILE @id < @count BEGIN
UPDATE ONE_INDEX SET F1 = NEWID(), F2 = NEWID(), F3 = NEWID() WHERE ID = @id
SET @id = @id + 1;
END
COMMIT TRANSACTION;
PRINT ' UPDATE ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';
SET @t = CURRENT_TIMESTAMP
DELETE FROM ONE_INDEX;
PRINT ' DELETE ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';
PRINT 'THREE_INDEXES:'
SET @t = CURRENT_TIMESTAMP
SET @id = 0;
BEGIN TRANSACTION;
WHILE @id < @count BEGIN
INSERT INTO THREE_INDEXES VALUES(@id, NEWID(), NEWID(), NEWID());
SET @id = @id + 1;
END
COMMIT TRANSACTION;
PRINT ' INSERT ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';
SET @t = CURRENT_TIMESTAMP
SET @id = 0;
BEGIN TRANSACTION;
WHILE @id < @count BEGIN
UPDATE THREE_INDEXES SET F1 = NEWID(), F2 = NEWID(), F3 = NEWID() WHERE ID = @id
SET @id = @id + 1;
END
COMMIT TRANSACTION;
PRINT ' UPDATE ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';
SET @t = CURRENT_TIMESTAMP
DELETE FROM THREE_INDEXES;
PRINT ' DELETE ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';
GO
DROP TABLE ONE_INDEX;
DROP TABLE THREE_INDEXES;
GO
Which (on my MS SQL Server Express 2008 R2 under virtual machine) prints:
ONE_INDEX:
INSERT 100000 rows: 4173 ms
UPDATE 100000 rows: 5530 ms
DELETE 100000 rows: 2706 ms
THREE_INDEXES:
INSERT 100000 rows: 6640 ms
UPDATE 100000 rows: 10436 ms
DELETE 100000 rows: 3516 ms
Increasing the @count
to 1000000 results in:
ONE_INDEX:
INSERT 1000000 rows: 40143 ms
UPDATE 1000000 rows: 55796 ms
DELETE 1000000 rows: 95576 ms
THREE_INDEXES:
INSERT 1000000 rows: 61360 ms
UPDATE 1000000 rows: 91766 ms
DELETE 1000000 rows: 99500 ms
Please note that in most workloads reads outnumber writes, so the optimization efforts usually prioritize covering SELECTs with indexes at the expense of INSERTs, UPDATEs and DELETEs. Also, a composite index will influence query performance in a different way compared to 3 separate indexes.
Only you can perform the appropriate tests and reach the right balance between all these concerns.
Upvotes: 4
Reputation: 10013
Keep in mind that a composite index made up of 3 columns like LastName, FirstName and City is not useful when you need to search on City. So in this case you would need to have an index just for City.
Upvotes: 0
Reputation: 30314
I think the only way to answer is to test a few scenarios out on your hardware with your systems usage patterns. Typically index overhead is huge compared to not having indexes. So the more columns in a index the more overhead. But the user perceived impact may seem nil, but profile the different tests and look at the numbers. I've worked at places where they just throw indexes on everything. I don't agree with this approach. I think you should put an index on when you can prove it will be of value. They take up space in addition to adding overhead. Again you will need to test your own setup to answer your question though.
Upvotes: 3
Reputation: 86706
The INSERT overhead of INDEXes is normally negligible regardless of single or multi column.
Unless writes significantly out weigh reads, always include whatever indexes are necessary to improve the performance of your queries.
In some case multiple single column indexes sufficiently boost the performance of many more queries than a single multi-column index. More often, a single multi-column index further boosts performance of a smaller set of queries.
The general rule is to consider the query performance, not the insert performance, unless you can predict or see a particular issue with inserts.
Upvotes: 7