Search performance by column value in PostgreSQL

Question

I have a table with about 4000 rows in it. That table has a column called "type". The query

SELECT DISTINCT type FROM table

shows 14 rows.

In the nearest future the table is going to grow up to about 10 million rows, and I don't know whether the distinct "type" values are going to be left 14 or increase.

Does it make sense if I create another table called "types" and make the "type_id" column in my main table instead of string "type"? In this case I'll have to insert new row in types table every time I meet a new value.

Or is it enough just to add a hash index for my type column then search with

SELECT * FROM table WHERE table.type IN('value1', 'value2', 'value3')

?

Gordon Linoff · Accepted Answer

I vote for a new table. This will ensure that you have an explicit list of types. You can add additional information, such as:

Code
Name
Description
Creation date
Created by

In addition, this will make any query to get the list of types (say for an application) faster. It will also ensure that the types are consistent. This is particularly important if they are strings -- which can be misspelled, have hidden characters, or simply proliferate.

Search performance by column value in PostgreSQL

Answers (1)

Related Questions