Özenç B.
Özenç B.

Reputation: 1038

PostgreSQL: Pattern matching only whole words

I have a "queries" table that holds hundreds of SQL queries and I am trying to filter out queries that can only be executed on the DB I am using. Because some of these queries refer to tables that exists only in another DB, so only a fraction of them can be executed successfully.

My query so far looks like this:

SELECT rr.name AS query_name,
    (
        SELECT string_agg(it.table_name::character varying, ', ' ORDER BY it.table_name)
        FROM information_schema.tables it
        WHERE rr.config ->> 'queries' SIMILAR TO ('%' || it.table_name || '%')
    )       AS related_tables

FROM queries rr

and it does work fine except the pattern I provided is not the best to filter out edge cases.

Let's say that I have a table called "customers_archived" in the old DB that does not exist in the new one, and a table called "customers" that exists in both the old and the new DB.

Now, with the query I wrote the engine thinks, "Well, I have a table called customers so any query that includes the word customers must be valid", but the engine is wrong because it also picks queries that include "customers_archived" table which does not exist in that DB.

So I tried to match whole words only but I could not get it to work because \ character won't work in PGSQL as far as I am concerned. How can I get this query to do what I am trying to achieve?

Upvotes: 1

Views: 127

Answers (1)

Laurenz Albe
Laurenz Albe

Reputation: 246513

There is no totally reliable way of finding the tables referenced by a query short of building a full PostgreSQL SQL parser. For starters, the name could occur in a string literal, or the query could be

DO $$BEGIN EXECUTE 'SELECT * FROM my' || 'table'; END;$$;

But I think you would be better off if you make sure that there are non-word characters around your name in the match:

WHERE rr.config ->> 'queries' ~ '\y' || it.table_name || '\y'

Upvotes: 1

Related Questions