Vince Ashby-Smith
Vince Ashby-Smith

Reputation: 1192

Performance overhead of single column vs multiple column indexes

Hi i was wondering what the performance overheads are like for single column vs multiple column indexes in terms of inserts. So for example if i have 3 single column indexes would that be better for inserts into that table as opposed to having 1 index with multiple columns. When i'm talking about performance i'm interested in raw speed.

Upvotes: 10

Views: 7506

Answers (4)

Branko Dimitrijevic
Branko Dimitrijevic

Reputation: 52107

if i have 3 single column indexes would that be better for inserts into that table as opposed to having 1 index with multiple columns.

For operations that modify data, a single index with 3 columns should be faster than 3 indexes with single column, for following reasons:

  • Updating an index requires searching through the B-Tree for the correct place to perform the modification. Performing search in one index (even when it is composite) tends to be faster than performing 3 searches in 3 indexes.
  • Each B-Tree leaf contains a row "pointer". 1 index will have 3 times less row pointers than 3 indexes (disregarding the effects of NULL, which is typically not indexed). And smaller is often faster due effects of caching.
  • 1 index may tend to have less splitting and coalescing of the B-Tree nodes compared to 3 indexes.

Consider the following MS SQL Server benchmark:

CREATE TABLE ONE_INDEX (
    ID int PRIMARY KEY NONCLUSTERED,
    F1 uniqueidentifier NOT NULL,
    F2 uniqueidentifier NOT NULL,
    F3 uniqueidentifier NOT NULL
);
CREATE INDEX ONE_INDEX_IE1 ON ONE_INDEX (F1, F2, F3);

CREATE TABLE THREE_INDEXES (
    ID int PRIMARY KEY NONCLUSTERED,
    F1 uniqueidentifier NOT NULL,
    F2 uniqueidentifier NOT NULL,
    F3 uniqueidentifier NOT NULL
);
CREATE INDEX THREE_INDEXES_IE1 ON THREE_INDEXES (F1);
CREATE INDEX THREE_INDEXES_IE2 ON THREE_INDEXES (F2);
CREATE INDEX THREE_INDEXES_IE3 ON THREE_INDEXES (F3);
GO



SET NOCOUNT ON
DECLARE @t DATETIME;
DECLARE @id INT;
DECLARE @count INT;
SET @count = 100000;



PRINT 'ONE_INDEX:'

SET @t = CURRENT_TIMESTAMP
SET @id = 0;
BEGIN TRANSACTION;
WHILE @id < @count BEGIN
    INSERT INTO ONE_INDEX VALUES(@id, NEWID(), NEWID(), NEWID());
    SET @id = @id + 1;
END
COMMIT TRANSACTION;
PRINT '    INSERT ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';

SET @t = CURRENT_TIMESTAMP
SET @id = 0;
BEGIN TRANSACTION;
WHILE @id < @count BEGIN
    UPDATE ONE_INDEX SET F1 = NEWID(), F2 = NEWID(), F3 = NEWID() WHERE ID = @id
    SET @id = @id + 1;
END
COMMIT TRANSACTION;
PRINT '    UPDATE ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';

SET @t = CURRENT_TIMESTAMP
DELETE FROM ONE_INDEX;
PRINT '    DELETE ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';



PRINT 'THREE_INDEXES:'

SET @t = CURRENT_TIMESTAMP
SET @id = 0;
BEGIN TRANSACTION;
WHILE @id < @count BEGIN
    INSERT INTO THREE_INDEXES VALUES(@id, NEWID(), NEWID(), NEWID());
    SET @id = @id + 1;
END
COMMIT TRANSACTION;
PRINT '    INSERT ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';

SET @t = CURRENT_TIMESTAMP
SET @id = 0;
BEGIN TRANSACTION;
WHILE @id < @count BEGIN
    UPDATE THREE_INDEXES SET F1 = NEWID(), F2 = NEWID(), F3 = NEWID() WHERE ID = @id
    SET @id = @id + 1;
END
COMMIT TRANSACTION;
PRINT '    UPDATE ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';

SET @t = CURRENT_TIMESTAMP
DELETE FROM THREE_INDEXES;
PRINT '    DELETE ' + CAST(@count AS VARCHAR) + ' rows: ' + CAST(DATEDIFF(ms, @t, CURRENT_TIMESTAMP) AS VARCHAR) + ' ms';
GO



DROP TABLE ONE_INDEX;
DROP TABLE THREE_INDEXES;
GO

Which (on my MS SQL Server Express 2008 R2 under virtual machine) prints:

ONE_INDEX:
    INSERT 100000 rows: 4173 ms
    UPDATE 100000 rows: 5530 ms
    DELETE 100000 rows: 2706 ms
THREE_INDEXES:
    INSERT 100000 rows: 6640 ms
    UPDATE 100000 rows: 10436 ms
    DELETE 100000 rows: 3516 ms

Increasing the @count to 1000000 results in:

ONE_INDEX:
    INSERT 1000000 rows: 40143 ms
    UPDATE 1000000 rows: 55796 ms
    DELETE 1000000 rows: 95576 ms
THREE_INDEXES:
    INSERT 1000000 rows: 61360 ms
    UPDATE 1000000 rows: 91766 ms
    DELETE 1000000 rows: 99500 ms

Please note that in most workloads reads outnumber writes, so the optimization efforts usually prioritize covering SELECTs with indexes at the expense of INSERTs, UPDATEs and DELETEs. Also, a composite index will influence query performance in a different way compared to 3 separate indexes.

Only you can perform the appropriate tests and reach the right balance between all these concerns.

Upvotes: 4

JBrooks
JBrooks

Reputation: 10013

Keep in mind that a composite index made up of 3 columns like LastName, FirstName and City is not useful when you need to search on City. So in this case you would need to have an index just for City.

Upvotes: 0

Kuberchaun
Kuberchaun

Reputation: 30314

I think the only way to answer is to test a few scenarios out on your hardware with your systems usage patterns. Typically index overhead is huge compared to not having indexes. So the more columns in a index the more overhead. But the user perceived impact may seem nil, but profile the different tests and look at the numbers. I've worked at places where they just throw indexes on everything. I don't agree with this approach. I think you should put an index on when you can prove it will be of value. They take up space in addition to adding overhead. Again you will need to test your own setup to answer your question though.

Upvotes: 3

MatBailie
MatBailie

Reputation: 86706

The INSERT overhead of INDEXes is normally negligible regardless of single or multi column.

Unless writes significantly out weigh reads, always include whatever indexes are necessary to improve the performance of your queries.

In some case multiple single column indexes sufficiently boost the performance of many more queries than a single multi-column index. More often, a single multi-column index further boosts performance of a smaller set of queries.

The general rule is to consider the query performance, not the insert performance, unless you can predict or see a particular issue with inserts.

Upvotes: 7

Related Questions