Reputation: 227
I have a schema myschema
in which I have dozens of tables, each containing a specific column maincolumn
. I would like to create a statistics of the whole schema, where I can see how many distinct values are in the maincolumn
per each table.
I know I can list the tables in a schema as a result of this query:
SELECT tablename
FROM pg_tables
WHERE schemaname = 'myschema'
Results:
tablename
---------
table1
table2
table3
...
Also I can run the query for each table:
SELECT COUNT(DISTINCT(maincolumn))
FROM myschema.table1
Results:
count
-----
2972
But I wonder if there's a way to join these two tasks into one query, which would result this:
tablename | count
----------+------
table1 | 2972
table2 | 542
table3 | 1523
....
A not so elegant solution would be to list all my tables, put the result in an Excel table and use CONCATENATE
function to create a SELECT
query to every table, and then I could run it line by line, but I would prefer an easier way.
Upvotes: 1
Views: 286
Reputation: 246393
You would need dynamic SQL for that, probably in a PL/pgSQL function like this:
CREATE OR REPLACE FUNCTION table_counts() RETURNS TABLE (table_name text, count bigint)
LANGUAGE plpgsql AS
$$BEGIN
FOR table_counts.table_name IN
SELECT t.table_name
FROM information_schema.tables AS t
WHERE t.table_schema = 'myschema'
AND t.table_type = 'BASE TABLE'
LOOP
EXECUTE format(
'SELECT count(*) FROM myschema.%I',
table_counts.table_name
) INTO table_counts.count;
RETURN NEXT;
END LOOP;
END;$$;
Also, this is quite expensive.
If approximate counts are good enough for you, try
SELECT c.relname, c.reltuples
FROM pg_catalog.pg_class AS c
JOIN pg_catalog.pg_namespace AS n
ON c.relnamespace = n.oid
WHERE n.nspname = 'myschema'
AND c.relkind = 'r';
The result becomes more accurate if you run ANALYZE
first.
Upvotes: 2