Reputation: 10189
We are running PostgreSQL 9.1. We previously had over 1 billion rows in one table, and those rows have since been deleted. However, the \l+
command still reports the database size inaccurately: it shows 568 GB, but in reality it should be much, much less.
The proof that 568 GB is wrong is that the individual table sizes don't tally up to that number. As you can see below, the top 20 relations come to about 4292 MB, and the remaining 985 relations are each well under 10 MB. In fact, all of them together add up to less than about 6 GB (see the aggregate check after the listing below).
Any idea why PostgreSQL has so much bloat here? If bloat is confirmed, how can I reclaim the space? I am not super familiar with VACUUM; is that what I need to do? If so, how?
Much appreciated.
pmlex=# \l+
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges | Size | Tablespace | Description
-----------------+----------+----------+-------------+-------------+-----------------------+---------+------------+--------------------------------------------
pmlex | pmlex | UTF8 | en_US.UTF-8 | en_US.UTF-8 | | 568 GB | pg_default |
pmlex_analytics | pmlex | UTF8 | en_US.UTF-8 | en_US.UTF-8 | | 433 MB | pg_default |
postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | | 5945 kB | pg_default | default administrative connection database
template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +| 5841 kB | pg_default | unmodifiable empty database
| | | | | postgres=CTc/postgres | | |
template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +| 5841 kB | pg_default | default template for new databases
| | | | | postgres=CTc/postgres | | |
(5 rows)
pmlex=# SELECT nspname || '.' || relname AS "relation",
pmlex-# pg_size_pretty(pg_relation_size(C.oid)) AS "size"
pmlex-# FROM pg_class C
pmlex-# LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
pmlex-# WHERE nspname NOT IN ('pg_catalog', 'information_schema')
pmlex-# ORDER BY pg_relation_size(C.oid) DESC;
relation | size
-------------------------------------+---------
public.page_page | 1289 MB
public.page_pageimagehistory | 570 MB
pg_toast.pg_toast_158103 | 273 MB
public.celery_taskmeta_task_id_key | 233 MB
public.page_page_unique_hash_uniq | 140 MB
public.page_page_ad_text_id | 136 MB
public.page_page_kn_result_id | 125 MB
public.page_page_seo_term_id | 124 MB
public.page_page_kn_search_id | 124 MB
public.page_page_direct_network_tag | 124 MB
public.page_page_traffic_source_id | 123 MB
public.page_page_active | 123 MB
public.page_page_is_referrer | 123 MB
public.page_page_category_id | 123 MB
public.page_page_host_id | 123 MB
public.page_page_serp_id | 121 MB
public.page_page_domain_id | 120 MB
public.celery_taskmeta_pkey | 106 MB
public.page_pagerenderhistory | 102 MB
public.page_page_campaign_id | 89 MB
...
...
...
pg_toast.pg_toast_4354379 | 0 bytes
(1005 rows)
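For what it's worth, the tally can be checked in one shot by aggregating the same catalog query (the ::bigint cast is needed because pg_size_pretty on 9.1 only accepts bigint):
SELECT pg_size_pretty(SUM(pg_relation_size(C.oid))::bigint) AS total
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema');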
Upvotes: 5
Views: 8382
Reputation: 31
It will likely look different if you use pg_total_relation_size instead of pg_relation_size. pg_relation_size doesn't give the total size of the table: it measures only the main data fork, leaving out indexes and TOAST data. See
https://www.postgresql.org/docs/9.5/static/functions-admin.html#FUNCTIONS-ADMIN-DBSIZE
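For example, here is the query from the question reworked with pg_total_relation_size (a sketch; the extra filters on relkind and pg_toast are my additions, so that indexes and TOAST tables aren't counted twice, since pg_total_relation_size already folds them into their parent table's total):
SELECT nspname || '.' || relname AS "relation",
       pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
  AND C.relkind <> 'i'              -- skip indexes; counted with their table
  AND nspname NOT LIKE 'pg_toast%'  -- skip TOAST tables; likewise included
ORDER BY pg_total_relation_size(C.oid) DESC;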
Upvotes: 0
Reputation: 9756
Your options include:
1). Ensuring autovacuum is enabled and set aggressively.
2). Recreating the table as I mentioned in an earlier comment (create-table-as-select + truncate + reload the original table).
3). Running CLUSTER on the table if you can afford to be locked out of that table (exclusive lock).
4). Running VACUUM FULL, though CLUSTER is more efficient and recommended (a rough sketch of options 3 through 5 follows this list).
5). Running a plain VACUUM ANALYZE a few times and leaving the table as-is; the file won't shrink, but the freed space will be reused as new data comes in.
6). Dumping and reloading the table via pg_dump.
7). pg_repack (though I haven't used it in production).
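The sketch, using the big table from the question (the index name page_page_pkey is a guess; substitute one of your real indexes):
-- 3). Rewrite the table in the order of the given index, returning the
--     reclaimed space to the OS. Takes an exclusive lock for the duration.
CLUSTER page_page USING page_page_pkey;
-- 4). Rewrite the table compactly without reordering rows.
--     Also holds an exclusive lock while it runs.
VACUUM FULL page_page;
-- 5). Plain vacuum: no rewrite and no exclusive lock. The file does not
--     shrink, but dead row space becomes reusable for future inserts.
VACUUM ANALYZE page_page;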
Upvotes: 3