spike
spike

Reputation: 10004

Check last refreshed time for materialized view

I've got a materialized view called price_changes used for some reporting. I've also got a cron job refreshing the materialized view with refresh materialized view price_changes. Everything is working great.

I'd like to give users looking at the report a message "Data is fresh as of X". I could store it somewhere when cron runs, but does postgres already store this metadata somewhere?

Upvotes: 38

Views: 32206

Answers (6)

Jesus Iniesta
Jesus Iniesta

Reputation: 12499

From CLI

Let's start by declaring some variables for the materialized view we want to check, and the database it belongs to.

materialized_view_name='my_materialized_view'
database_name='my_database'

Next, we need to find the id of the materialized view. We can do this by querying the pg_class table.

In the followingw query, replace <materialized_view_name> with the name of your materialized view

id=$(psql -d $database_name -c "SELECT relfilenode FROM pg_class WHERE relname = '<materialized_view_name>';" -tA)

Now we can find the file path of the materialized view and, therefore, the last time it was updated.

file_path=$(find / -name "${id}" 2>/dev/null)
ls -l $file_path

Bonus: sometimes servers can have a different time zone to your local machine. You can get a reference to the server's time zone by running the following command.

date

Upvotes: -2

Tomasz
Tomasz

Reputation: 1396

Thank you @ajaest for your help. We had production environment with many instances. We had to prevent materialized vied, not to be refresh quicker then 50 sec. Its our solution. We used COMMENT ON MATERIALIZED VIEW

-- dodanie funkcji do odświeżania widoku
CREATE OR REPLACE FUNCTION public.refresh_mv_users()
    RETURNS timestamp AS $$
DECLARE
    last_refreshed  TIMESTAMP;
    next_refresh    TIMESTAMP;
    comment_sql     text;
BEGIN

    SELECT ((pg_catalog.obj_description('public.mv_users'::regclass, 'pg_class')::json->>'last_refreshed')::timestamp) into last_refreshed;
    SELECT (last_refreshed + (50 * interval '1 seconds')) into next_refresh;

    IF next_refresh < now() THEN
        REFRESH MATERIALIZED VIEW CONCURRENTLY public.mv_users;
        comment_sql := 'COMMENT ON MATERIALIZED VIEW public.mv_users is '|| quote_literal('{"last_refreshed": "' || now() || '"}');
        execute comment_sql;
        SELECT now() into last_refreshed;
    END IF;

    RETURN last_refreshed;
END;
$$  LANGUAGE plpgsql;

Upvotes: 4

rsn86
rsn86

Reputation: 361

WITH
        pgdata AS (
                SELECT
                        setting AS path
                FROM
                        pg_settings
                WHERE
                        name = 'data_directory'
        ),
        path AS (
                SELECT
                        CASE
                                WHEN pgdata.separator = '/' THEN '/'    -- UNIX
                                ELSE '\'                                -- WINDOWS
                        END AS separator
                FROM 
                        (SELECT SUBSTR(path, 1, 1) AS separator FROM pgdata) AS pgdata
        )
SELECT
        ns.nspname||'.'||c.relname AS mview,
        (pg_stat_file(pgdata.path||path.separator||pg_relation_filepath(ns.nspname||'.'||c.relname))).modification AS refresh
FROM
        pgdata,
        path,
        pg_class c
JOIN
        pg_namespace ns ON c.relnamespace=ns.oid
WHERE
        c.relkind='m'
;

Upvotes: 20

ajaest
ajaest

Reputation: 627

I think @richyen solution is the best if you have access to the server.

If you don't you have to resort to manually storing last refresh date as metadata somehow. Some options are:

  • As column as in @thames's response, being the main drawback the extra used storage for the column (repeated as many times as rows have the MV) and the refresh problems in big tables stated by @smudge.

  • In a custom table. The main drawback is that you have to ensure that all refreshed include the table update.

-- Create table
CREATE TABLE pg_matviews_last_refreshed (
  matviewowner NAME NOT NULL,
  matviewname NAME NOT NULL,
  schemaname NAME NOT NULL,
  last_refreshed TIMESTAMP NOT NULL,
  CONSTRAINT pk UNIQUE (matviewowner, matviewname, schemaname)
)

CREATE VIEW pg_matviews_extra AS
SELECT * 
FROM pg_matviews
NATURAL JOIN pg_matviews_last_refreshed

-- Set initial values
INSERT INTO pg_matviews_last_refreshed (
  matviewowner,
  matviewname,
  schemaname,
  last_refreshed
)
SELECT 
  matviewowner,
  matviewname,
  schemaname,
  CURRENT_TIMESTAMP
FROM pg_matviews;

-- Consult dates
SELECT * FROM pg_matviews_extra

-- Refresh 
BEGIN;
REFRESH MATERIALIZED VIEW CONCURRENTLY my_materialized_view;

-- Insert update date into last_refreshed table. Of course, if 
-- more complex permissions are into place, the query becomes more
-- complex to discover owner, schema, etc.
INSERT INTO pg_matviews_last_refreshed (matviewname, matviewowner, schemaname, last_refreshed)
SELECT matviewname, matviewowner, schemaname, CURRENT_TIMESTAMP
FROM pg_matviews
WHERE matviewname='my_materialized_view'
ON CONFLICT (matviewname, matviewowner, schemaname)
DO UPDATE SET last_refreshed=EXCLUDED.last_refreshed;
COMMIT;
  • Similar to having a dedicated table to store last refreshed date, you may store that info implicitly in the MV comment:
-- Refresh operation
BEGIN;
REFRESH MATERIALIZED VIEW CONCURRENTLY my_materialized_view;
-- This statement can be wrapped into a function to allow using CURRENT_TIMESTAMP
COMMENT ON MATERIALIZED VIEW my_materialized_view IS '{"last_refreshed": "2021-07-07T09:30:59Z"}'
COMMIT;


-- Retrieve refresh date
SELECT (pg_catalog.obj_description('my_materialized_view'::regclass, 'pg_class')::json->>'last_refreshed')::timestamp;

Upvotes: 6

richyen
richyen

Reputation: 9978

Since materialized views are segments of data stored on disk, they will have a corresponding file in the filesystem. When you call REFRESH MATERIALIZED VIEW, the data on disk gets re-created under a new filename. As such, you could look up the modification/creation timestamp for the view by cross-referencing with relfilenode in pg_class:

[user@server /]# psql -c "create materialized view myview as select aid from pgbench_accounts where aid < 100"
SELECT 99
[user@server /]# psql -c "select relfilenode from pg_class where relname = 'myview'"
 relfilenode 
-------------
       16445
(1 row)

[user@server /]# ls -l /var/lib/edb/as12/data/base/15369/16445
-rw------- 1 enterprisedb enterprisedb 8192 Jun 14 23:28 /var/lib/edb/as12/data/base/15369/16445
[user@server /]# date
Mon Jun 14 23:29:16 UTC 2021
[user@server /]# psql -c "refresh materialized view myview"
REFRESH MATERIALIZED VIEW
[user@server /]# psql -c "select relfilenode from pg_class where relname = 'myview'"
 relfilenode 
-------------
       16449
(1 row)

[user@server /]# ls -l /var/lib/edb/as12/data/base/15369/16449
-rw------- 1 enterprisedb enterprisedb 8192 Jun 14 23:29 /var/lib/edb/as12/data/base/15369/16449
[user@server /]# 

Upvotes: 13

thames
thames

Reputation: 6083

I don't think there is anything built in the system that provides this as of 9.3.4. When I need to provide the date of last refresh I add a column called 'last_refresh' to the select query in the materialized view since data in the materialized view won't change until it is refreshed.

I also prefer this for security reasons as you may not want to give the sql user access to the system tables, if the information is being stored there.

Depending if you need the time, you can use either:

  1. CURRENT_DATE
  2. now()

Just date:

CREATE MATERIALIZED VIEW mv_address AS 
SELECT *, CURRENT_DATE AS last_refresh FROM address;

With date and time:

CREATE MATERIALIZED VIEW mv_address AS 
SELECT *, now() AS last_refresh FROM address;

Update 2017-02-17:

PostgreSQL version 9.4+ now includes CONCURRENTLY option. If you use REFRESH MATERIALIZED VIEW CONCURRENTLY option be aware of what @Smudge indicated in the comments. This would really only be an issue for large and frequently updated data sets. If your data set is small or infrequently updated then you should be fine.

Upvotes: 22

Related Questions