KarlsFriend
KarlsFriend

Reputation: 745

Why does adding an index to a table make my query slower?

I have the following table that is very large. (> 1 Million Rows)

create table DataPoint (id int8 not null, 
                        time timestamp not null, 
                        value float8, dataFile_id int8 not null, 
                        type_id int8 not null, primary key (id));

All my other tables are much smaller (<1000) Rows.

I try to increase the performance of the following Query:

select dp.id, dp.value, dp.time 
from DataPoint dp 
inner join DataFile datafile2_ on dp.dataFile_id=datafile2_.id 
inner join DataType datatype4_ on dp.type_id=datatype4_.id 
where dp.dataFile_id=? and dp.type_id=? 
    and dp.value is not null order by dp.time asc limit 1

I already had an index on dp.dataFile_id. The Query took about 500ms. Then I added an index on dp.type_id which decreased the time to about 40ms.

But when I add an additional Index on dp.time, the query suddenly takes 2500ms! Removing the index brought the time back to 30-40ms. I find it very hard to understand how adding an index can decrease the performance of a query.

Upvotes: 0

Views: 97

Answers (2)

jjanes
jjanes

Reputation: 44137

With the time index in place, PostgreSQL thinks it will be fastest to use that index to walk through the rows in time order, and stop at the first row that meets the other conditions.

It seems that it actually has to walk through large portion of the table to find that first row, because the rows that meet your criteria are all in the later time periods, none are in the early time periods. But PostgreSQL doesn't know that, which is way it gets fooled into using the slower index.

As Gordon Linoff answered, you probably will benefit from a composite index.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269443

Either index could be used for this query. Why one is chosen and not the other depends on various factors, such as the statistics that the database keeps about the table.

However, the best index for this query is a composite index on DataPoint(dataFile_id, type_id, value, time).

Upvotes: 1

Related Questions