regex sql optimization

Question

I have a system (sysB) that stores names without any rules on the names and I have another system that has objects with ids (sysA). When data comes from sysA to sysB, the names that are put in there will be like "BIN-the id-and some other stuff".

On sysB, I get a list of ids from sysA then perform a regex query like this:

select id
from mytable
where "mytable"."name"::text ~ '(^BIN-8178-.*)|(^BIN-2345-.*)';

I've noticed that the query becomes so slow as the list grows bigger.

If the list of ids has 800 ids, the query takes 18 seconds!

I tried changing the field to be indexed; it did not help. I know that adding this index to the other system as a new field can speed it up but I was wondering if there is a way to speed up the query without changing the schema. Also note that the systems are independent so I cannot do join.

Gordon Linoff · Accepted Answer

Try this query:

select id
from mytable
where "mytable"."name" like 'BIN-8178%'

with an index on mytable(name). Postgres should be able to use the index for this LIKE pattern.

If the index is being used, it should also be used for:

select id
from mytable
where "mytable"."name" like 'BIN-8178%' or
      "mytable"."name" like 'BIN-2345%';

However, indexes can be finicky and you might need to use union all:

select id
from mytable
where "mytable"."name" like 'BIN-8178%' 
union all
select id
from mytable
where "mytable"."name" like 'BIN-2345%';

regex sql optimization

Answers (1)

Related Questions