yingzhox
yingzhox

Reputation: 307

How to manually update the statistics data of tables in PostgreSQL

The ANALYZE statement can be used in PostgreSQL to collect the statistics data of tables. However, I do not want to actually insert these data into tables, I just need to evaluate the cost of some queries, is there anyway to manually specify the statistics data of tables in PostgreSQL without actually putting data into it?

Upvotes: 5

Views: 29325

Answers (2)

Raphaël Braud
Raphaël Braud

Reputation: 1519

You can already EXPLAIN ANALYSE a query even with no inserted data, it will help you to get a feeling of the execution plan.

But there's no such thing as real data :) What you can do, as a workaround, is BEGINning a transaction, INSERT some data, EXPLAIN ANALYSE your query, then ROLLBACK your transaction.

Example :

mydatabase=# BEGIN;
BEGIN
mydatabase=# INSERT INTO auth_message (user_id, message) VALUES (1, 'foobar');
INSERT 0 1
mydatabase=# EXPLAIN ANALYSE SELECT count(*) FROM auth_message;
                                                   QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=24.50..24.51 rows=1 width=0) (actual time=0.011..0.011 rows=1 loops=1)
   ->  Seq Scan on auth_message  (cost=0.00..21.60 rows=1160 width=0) (actual time=0.007..0.008 rows=1 loops=1)
 Total runtime: 0.042 ms
(3 lignes)

mydatabase=# ROLLBACK;
ROLLBACK
mydatabase=# EXPLAIN ANALYSE SELECT count(*) FROM auth_message;
                                                   QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=24.50..24.51 rows=1 width=0) (actual time=0.011..0.011 rows=1 loops=1)
   ->  Seq Scan on auth_message  (cost=0.00..21.60 rows=1160 width=0) (actual time=0.009..0.009 rows=0 loops=1)
 Total runtime: 0.043 ms
(3 lignes)

The 1st EXPLAIN ANALYSE shows that there was some "temporary" data (rows=1)

This is not strictly a "mock", but at least, PostgreSQL plan execution (and various optimizations it could do) should be, IMHO, best than with no data (disclaimer : purely intuitive)

Upvotes: 2

Craig Ringer
Craig Ringer

Reputation: 324425

I think you are muddling ANALYZE with EXPLAIN ANALYZE. There are different things.

If you want query costs and timing without applying the changes, the only real option you have is to begin a transaction, execute the query under EXPLAIN ANALYZE, and then ROLLBACK.

This still executes the query, meaning that:

  • CPU time and I/O are consumed
  • Locks are still taken and held for the duration
  • New rows are actually written to the tables and indexes, but are never marked visible. They are cleaned up in the next VACUUM.

Upvotes: 8

Related Questions