Dan Rubio
Dan Rubio

Reputation: 4907

If my database column has a lot of null values, why shouldn't I use an index on it?

I read an article on tutorials point that says the following:

Indexes should not be used on columns that contain a high number of NULL values.

It doesn't say for what reason though.

Here's an example I thought of. Let's say I have a table called girls and on one of the columns is whether they have a boyfriend or not. Let's say that if they have a boyfriend the value is simply yes and if they don't it would just have a value of NULL.

Could it be that I wouldn't want to put an index on the has_boyfriend column because I could easily search for either/or type with SELECT * FROM girls WHERE has_boyfriend IS NULL or SELECT * FROM girls WHERE has_boyfriend IS NOT NULL, thus not necessarily requiring an index because of the simplicity? It's a contrived example but it's what I thought of.

Upvotes: 1

Views: 3050

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270483

Indexes on columns with lots of NULL values can be useful, particularly if your query is looking at the values that are populated.

For your example, though, you have a different issue. In general, indexes on binary columns are not useful. The purpose of indexes is to reduce the number of data pages that need to be read. In general, binary columns are going to have records with both values on any given data page.

There are two exceptions, but the second doesn't apply to Postgres. The first is if one of the values is so rare that the values occur on few data pages. Then an index can be helpful to find them.

The second is for a clustered index. Because Postgres does not support them, that exception is not relevant.

Upvotes: 4

Related Questions