Reputation: 7448
Is there a simple PostgreSQL or even SQL way of listing empty/not empty tables?
P.S.: I'm analyzing a database containing hundreds of tables and would like to detect "death code". I assume, when the table after some month is still empty, than it's not used.
EDIT:Solved
Thank you all! Finally this statement seems to output the statistics I can use:
select schemaname, relname, n_tup_ins from pg_stat_all_tables WHERE schemaname = 'public' ORDER BY n_tup_ins
Upvotes: 25
Views: 17574
Reputation:
Checking for the number of rows could give you wrong results. Assume that a table is used as a staging table: rows get inserted (e.g. from a flat file), processed and deleted. If you check the number of rows in that table you could very well believe it's never used if you don't happen to run your query while the processing takes place.
Another way to detect "unused" tables would be to monitor the IO and changes that are done to the tables.
The statistic view pg_stat_user_tables records changes (deletes, inserts, updates) to each table in the system. The statistic view pg_statio_user_tables records IO done against the tables.
If you take snapshots of those tables in regular intervals you can calculate the difference in the values and see if a tables is used at all.
You can use pg_stat_reset()
to reset all values to zero and then start from that.
Upvotes: 8
Reputation: 2097
You could use PostgreSQL's system catalogs with e.g.
SELECT n.nspname, c.relname
FROM pg_class c
INNER JOIN pg_namespace n ON (n.oid = c.relnamespace)
WHERE c.reltuples = 0 AND c.relkind = 'r';
According to the documentation, the number of rows is an estimate, though.
If your tables have columns that take their default values from sequences, you could list them and check their values with nextval
. (Unfortunately, currval
returns a session-dependent value, so you'd have to ensure that no one else is using the database and use both nextval
and setval
.)
SELECT n.nspname, c.relname
FROM pg_class c
INNER JOIN pg_namespace n ON (n.oid = c.relnamespace)
WHERE c.relkind = 'S';
(Unfortunately I couldn't yet find any way to determine, which sequence belongs to which table. Obviously it would be very helpful. Anyway, you can use pg_class.relnamespace
to narrow down the results.)
See http://www.postgresql.org/docs/9.3/interactive/catalogs-overview.html for details.
Upvotes: 13
Reputation: 2155
You can do this
CREATE OR REPLACE FUNCTION fn_table()
RETURNS TABLE(name text,count int) AS
$BODY$
DECLARE
data record;
v_sql text;
BEGIN
DROP TABLE IF EXISTS demo;
CREATE TEMP TABLE demo (name text,count int);
FOR data in (SELECT table_name FROM INFORMATION_SCHEMA.tables WHERE table_schema = 'public') LOOP
v_sql := 'INSERT INTO demo SELECT '''||data.table_name||''', COUNT(*) FROM '||data.table_name;
RAISE INFO 'v_sql:%',v_sql;
EXECUTE v_sql;
END LOOP;
RETURN QUERY (SELECT * FROM demo);
END
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
OUTPUT :
"child";0
"location_master";6
"location";5
"tmp";5
"a1";7
"b1";3
"master";0
"child2";0
"child1";0
Here child, master , child1 , child2 has 0 count that means they are empty tables.
Upvotes: 3
Reputation: 385
Please try this once,hope it helps you:
Running the query below will give you a list of tables and indexes that have not been used since SQL Server was last restarted. Once you have a list of tables, you can do a dependency check to get a list of stored procedures that use each table. Then you can search your C# source code for thos SPs and table names.
-- Unused tables & indexes. Tables have index_id’s of either 0 = Heap table or 1 = Clustered Index
SELECT OBJECTNAME = OBJECT_NAME(I.OBJECT_ID), INDEXNAME = I.NAME, I.INDEX_ID
FROM SYS.INDEXES AS I
INNER JOIN SYS.OBJECTS AS O
ON I.OBJECT_ID = O.OBJECT_ID
WHERE OBJECTPROPERTY(O.OBJECT_ID,'IsUserTable') = 1
AND I.INDEX_ID
NOT IN (SELECT S.INDEX_ID
FROM SYS.DM_DB_INDEX_USAGE_STATS AS S
WHERE S.OBJECT_ID = I.OBJECT_ID
AND I.INDEX_ID = S.INDEX_ID
AND DATABASE_ID = DB_ID(db_name()))
ORDER BY OBJECTNAME, I.INDEX_ID, INDEXNAME ASC
Upvotes: -2