Reputation: 12107
Being new to Postgres, I have a question regarding caching. I do not know how much is handled by the database, vs. how much I need to handle myself.
I have rows which are part of a time series. For the sake of example, let's say I have 1 row every second.
My query is to get the last minute (60 rows), on a rolling basis (from now - 1min to now).
Ideally, I could cache the last result in the client and request only the last second. Unfortunately, the data has holes in it: some rows are missing and may be added a few seconds later. So I need to query the whole window anyway.
I could find where the holes are and make a query just for these, this is also an option.
What I was wondering is: how much caching does Postgres perform on its own?
If I have a query returning rows 3, 4, 5, 6 and my next query returns 4, 5, 6, 7. Would rows 4, 5, 6 have to be fetched again, or are they cached by the database?
Upvotes: 1
Views: 1759
Reputation: 121594
Postgres includes an extensive caching system. It is very likely that subsequent executions of the same query will use the cached data, but we have virtually no influence on it because the cache works autonomously. What can and should be done is to use prepared statements. Per the documentation:
A prepared statement is a server-side object that can be used to optimize performance. When the PREPARE statement is executed, the specified statement is parsed, analyzed, and rewritten. When an EXECUTE command is subsequently issued, the prepared statement is planned and executed. This division of labor avoids repetitive parse analysis work, while allowing the execution plan to depend on the specific parameter values supplied.
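As a sketch of how this applies to the rolling-window query from the question (the table and column names `my_table`, `ts`, and `value` are assumptions, not from the original post):

```sql
-- Parsing, analysis, and rewriting happen once, at PREPARE time.
PREPARE last_minute(timestamptz) AS
    SELECT ts, value
    FROM my_table
    WHERE ts > $1 - interval '1 minute' AND ts <= $1
    ORDER BY ts;

-- Each execution only plans and runs the already-parsed statement.
EXECUTE last_minute(now());
```

Prepared statements live for the duration of the session, so the client can issue `EXECUTE last_minute(now())` once per second without repeating the parse work.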
However, the scenario described in the question suggests a completely different approach: the NOTIFY / LISTEN feature. In brief: the application performs the query only once and then fills in the data from subsequent notifications.
Sample trigger function:
create or replace function before_insert_on_my_table()
returns trigger
language plpgsql as $$
begin
    -- send the new row, in its text representation, on channel 'my_data'
    perform pg_notify('my_data', new::text);
    return new;
end $$;

create trigger before_insert_on_my_table
    before insert on my_table
    for each row
    execute procedure before_insert_on_my_table();
How to implement the LISTEN side depends on the language used; for example, Python's psycopg2 library has convenient tools for it.
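A minimal listener sketch with psycopg2, assuming the `my_data` channel from the trigger above; the DSN and the naive payload parsing are illustrative assumptions:

```python
import select

def payload_fields(payload):
    """Split a pg_notify payload of the form '(2021-01-01 00:00:01,42)'
    (the text form of a row, as sent by the trigger) into its fields.
    Naive split; does not handle quoted commas inside field values."""
    return payload.strip("()").split(",")

def listen_for_rows(dsn, channel="my_data"):
    """Block on the connection and yield each notification payload."""
    import psycopg2  # assumed installed; imported here so payload_fields works without it
    conn = psycopg2.connect(dsn)
    conn.set_session(autocommit=True)  # LISTEN takes effect immediately
    cur = conn.cursor()
    cur.execute(f"LISTEN {channel};")
    while True:
        # wait until the socket is readable, then collect pending notifications
        select.select([conn], [], [])
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            yield note.payload
```

The application would run its window query once at startup and then call `payload_fields` on each yielded payload to append the newly inserted rows, including late arrivals that fill the holes.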
Read more about NOTIFY in the docs.
Upvotes: 1
Reputation: 246383
PostgreSQL caches data automatically. Data are read, written and cached in units of 8kB, and whenever you access a row, the block containing the row will be cached.
All you need to ensure is that there is an index on the timestamp column of your table; otherwise PostgreSQL has to read the whole table to compute the result. With an index, it reads (and caches) only the blocks that contain the data you need, which will also speed up your next query for the same data.
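For concreteness (assuming the timestamp column is named `ts`), the index and the query it serves would look like:

```sql
CREATE INDEX my_table_ts_idx ON my_table (ts);

-- With the index, this becomes a range scan over a few 8kB blocks
-- instead of a sequential scan of the whole table:
SELECT *
FROM my_table
WHERE ts > now() - interval '1 minute'
ORDER BY ts;
```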
Upvotes: 2