Creating the optimum index for my database

Question

I have a table in postgresql with the following information:

rawData (fileID integer references otherTable, lineNum integer, data1 double, ...)

When I am searching this table, I do so with the following query:

SELECT lineNum, data1, ...other data FROM rawData WHERE 
fileID = ? AND data1 < ? ORDER BY lineNum;

In general, the data in this table is a number of entries for each fileID, and each fileID has lineNum from 0 to x, with lineNum never repeating for each fileID (but it does repeat for different fileID's). Then data1 is effectively a random number that may or may not overlap.

In order to speed up the reading of this data, I am trying to create an index on it, but am having trouble figuring out the best way to index it. Currently I am looking at one of the following two index methods, and am wondering which would be better for my search, or if there is another option that I haven't thought of that would be better than either of them.

index idea 1:

CREATE INDEX searchIndex ON rawData (fileID, data1, lineNum);

index idea 2:

CREATE INDEX searchIndex ON rawData (fileID, lineNum, data1);

Note that at this time, this and a search not constrained by data1 are the only searches that I run on this table, so I'm not too concerned about this index slowing down other searches.

Lastly, would I have to change my search query to use the index, or would it automatically use that index when I search the table?

Robins Tharakan · Accepted Answer

You should look at using this instead:

CREATE INDEX searchIndex ON rawData (fileID, lineNum);

A few things:

In particular, as per docs, Indexes with more than three columns are unlikely to be helpful unless the usage of the table is extremely stylized.
Since your second search query requires filtering without the data1 column, keeping the second column lineNum should be sufficient (since you mention it would be quasi-random), and in the rare occurrence that there are repeats, table fetches should ensure correctness. But what this would mean is that the Index would be 1/3rd smaller in size, which is a big win (Think index small-enough to be in memory / index-only-scans etc.)

Creating the optimum index for my database

Answers (2)

Related Questions