Reputation: 1038
I have a "queries" table that holds hundreds of SQL queries and I am trying to filter out queries that can only be executed on the DB I am using. Because some of these queries refer to tables that exists only in another DB, so only a fraction of them can be executed successfully.
My query so far looks like this:
SELECT rr.name AS query_name,
(
SELECT string_agg(it.table_name::character varying, ', ' ORDER BY it.table_name)
FROM information_schema.tables it
WHERE rr.config ->> 'queries' SIMILAR TO ('%' || it.table_name || '%')
) AS related_tables
FROM queries rr
and it does work fine except the pattern I provided is not the best to filter out edge cases.
Let's say that I have a table called "customers_archived" in the old DB that does not exist in the new one, and a table called "customers" that exists in both the old and the new DB.
Now, with the query I wrote the engine thinks, "Well, I have a table called customers so any query that includes the word customers must be valid", but the engine is wrong because it also picks queries that include "customers_archived" table which does not exist in that DB.
So I tried to match whole words only but I could not get it to work because \ character won't work in PGSQL as far as I am concerned. How can I get this query to do what I am trying to achieve?
Upvotes: 1
Views: 127
Reputation: 246513
There is no totally reliable way of finding the tables referenced by a query short of building a full PostgreSQL SQL parser. For starters, the name could occur in a string literal, or the query could be
DO $$BEGIN EXECUTE 'SELECT * FROM my' || 'table'; END;$$;
But I think you would be better off if you make sure that there are non-word characters around your name in the match:
WHERE rr.config ->> 'queries' ~ '\y' || it.table_name || '\y'
Upvotes: 1