Reputation: 745
I have the following table, which is very large (> 1 million rows):
create table DataPoint (
    id int8 not null,
    time timestamp not null,
    value float8,
    dataFile_id int8 not null,
    type_id int8 not null,
    primary key (id)
);
All my other tables are much smaller (< 1000 rows).
I am trying to improve the performance of the following query:
select dp.id, dp.value, dp.time
from DataPoint dp
inner join DataFile datafile2_ on dp.dataFile_id = datafile2_.id
inner join DataType datatype4_ on dp.type_id = datatype4_.id
where dp.dataFile_id = ? and dp.type_id = ?
  and dp.value is not null
order by dp.time asc
limit 1
I already had an index on dp.dataFile_id; the query took about 500 ms. Then I added an index on dp.type_id, which decreased the time to about 40 ms.
But when I added an additional index on dp.time, the query suddenly took 2500 ms! Removing that index brought the time back to 30-40 ms. I find it very hard to understand how adding an index can hurt the performance of a query.
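For reference, the indexes involved look roughly like this (the index names here are just for illustration):

create index datapoint_datafile_idx on DataPoint (dataFile_id);
create index datapoint_type_idx on DataPoint (type_id);
create index datapoint_time_idx on DataPoint (time); -- the index that makes the query slower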
Upvotes: 0
Views: 97
Reputation: 44137
With the time index in place, PostgreSQL thinks it will be fastest to use that index to walk through the rows in time order and stop at the first row that meets the other conditions.
It seems that it actually has to walk through a large portion of the table to find that first row, because the rows that meet your criteria are all in the later time periods; none are in the early time periods. But PostgreSQL doesn't know that, which is why it gets fooled into using the slower index.
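You can confirm this by looking at the execution plan. A sketch, with hypothetical literal values standing in for the ? parameters; if the planner picked the time index, the slow plan will show an index scan on that index with a large "Rows Removed by Filter" count:

explain (analyze, buffers)
select dp.id, dp.value, dp.time
from DataPoint dp
inner join DataFile datafile2_ on dp.dataFile_id = datafile2_.id
inner join DataType datatype4_ on dp.type_id = datatype4_.id
where dp.dataFile_id = 42 and dp.type_id = 7
  and dp.value is not null
order by dp.time asc
limit 1;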
As Gordon Linoff answered, you will probably benefit from a composite index.
Upvotes: 0
Reputation: 1269443
Either index could be used for this query. Why one is chosen and not the other depends on various factors, such as the statistics that the database keeps about the table.
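For example, refreshing the table's statistics can change which index the planner picks (a minimal sketch; in PostgreSQL, ANALYZE here collects statistics rather than profiling a query):

analyze DataPoint;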
However, the best index for this query is a composite index on DataPoint(dataFile_id, type_id, value, time).
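A minimal sketch of creating that index (the index name is an assumption):

create index datapoint_file_type_value_time_idx
    on DataPoint (dataFile_id, type_id, value, time);

Putting the two equality columns first means all matching rows are contiguous in the index, so the scan no longer has to filter out non-matching rows one by one.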
Upvotes: 1