mgoldwasser
mgoldwasser

Reputation: 15424

Create index on first 3 characters (area code) of phone field?

I have a Postgres table with a phone field stored as varchar(10), but we search on the area code frequently, e.g.:

select * from bus_t where bus_phone like '555%'

I wanted to create an index to facilitate with these searches, but I got an error when trying:

CREATE INDEX bus_ph_3 ON bus_t USING btree (bus_phone::varchar(3));

ERROR: 42601: syntax error at or near "::"

My first question is, how do I accomplish this, but also I am wondering if it makes sense to index on the first X characters of a field or if indexing on the entire field is just as effective.

Upvotes: 3

Views: 4752

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656942

Actually, a plain B-tree index is useless for pattern matching with LIKE (~~) or regexp (~), even with left-anchored patterns, if your installation doesn't run on locale "C" - which is the typical case. Here is an overview over pattern matching and indices:

Create an index with the varchar_pattern_ops operator class (matching your varchar column) and be sure to read the chapter on operator classes in the manual.

CREATE INDEX bus_ph_pattern_ops_idx ON bus_t (bus_phone varchar_pattern_ops);

Your original query can use this index:

... WHERE bus_phone LIKE '555%'

Performance of a functional index on the first 3 characters as described in the answer by @a_horse is pretty much the same in this case.

fiddle
Old sqlfiddle

Generally, a functional index on relevant leading characters would be be a good idea, but your column has only 10 characters. Consider that the overhead per tuple is already 28 bytes. Saving 7 bytes is just not substantial enough to make a big difference. Add the cost for the function call and the fact that xxx_pattern_ops are generally a bit faster.
In Postgres 9.2 or later the index on the full column can also serve as covering index in index-only scans.

However, the more characters in the columns, the bigger the benefit from a functional index.
You may even have to resort to a prefix index (or some other kind of hash) if strings get too long. There is a maximum length for indices. See:

If you decide to go with the functional index, consider using the xxx_pattern_ops variant for a small additional performance benefit. Be sure to read about the pros and cons in the manual and in Peter Eisentraut's blog entry:

CREATE INDEX bus_ph_3 ON bus_t (left(bus_phone, 3) varchar_pattern_ops);

Explanation for error message

You'd have to use the standard SQL cast syntax for functional indices. This would work - pretty much like the one with left(), but like @a_horse I'd prefer left().

CREATE INDEX bus_ph_3 ON bus_t USING btree (cast(bus_phone AS varchar(3));

Upvotes: 4

user330315
user330315

Reputation:

When using like '555%' an index on the complete column will be used just as well. There is no need to only index the first three characters.

If you do want to index only the first 3 characters (e.g. to save space), then you could use the left() funcion:

CREATE INDEX bus_ph_3 ON bus_t USING btree (left(bus_phone,3));

But in order for that index to be used, you would need to use that expression in your where clause:

where left(bus_phone,3) = '555';

But again: that is most probably overkill and the index on the complete column will be good enough and can be used for other queries as well e.g. bus_phone = '555-1234' which the index on just the first three characters would not.

Upvotes: 4

Related Questions