Refresh a materialized view automatically using a rule or notify

mawimawi picture mawimawi · May 28, 2014 · Viewed 55.2k times · Source

I have a materialized view on a PostgreSQL 9.3 database which seldom changes (about twice a day). But when it does, I'd like to update its data promptly.

Here is what I was thinking about so far:

There is a materialized view mat_view which gets its data from tables table1 and table2 using some join statement.

Whenever something in table1 or table2 changes, I already have a trigger which updates a little configuration table config consisting of

table_name | mat_view_name | need_update
-----------+---------------+------------
table1     | mat_view      | TRUE/FALSE
table2     | mat_view      | TRUE/FALSE

So if anything in table1 changes (there's a trigger on UPDATE and on DELETE for every statement), the field need_update in the first row is set to TRUE. The same goes for table2 and the second row.

Obviously, if need_update is TRUE, then the materialized view must be refreshed.

UPDATE: Since materialized views do not support rules (as @pozs mentioned in a comment below), I would go one step further. I'd create a dummy view v_mat_view with the definition "SELECT * FROM mat_view". When the user does a SELECT on this view, I need to create a rule ON SELECT which does the following:

  • check whether mat_view should be updated (SELECT 1 FROM config WHERE mat_view_name='mat_view' AND need_update=TRUE)
  • reset the need_update flag with UPDATE config SET need_update=FALSE where mat_view_name='mat_view'
  • REFRESH MATERIALIZED VIEW mat_view
  • and at last do the original SELECT statement but with mat_view as the target.

UPDATE2: I tried creating the steps above:

Create a function that handles the four points mentioned above:

CREATE OR REPLACE FUNCTION mat_view_selector()
RETURNS SETOF mat_view AS $body$
BEGIN
  -- here is checking whether to refresh the mat_view
  -- then return the select:
  RETURN QUERY SELECT * FROM mat_view;
END;
$body$ LANGUAGE plpgsql;

Create the view v_mat_view which really selects from the function mat_view_selector:

CREATE TABLE v_mat_view AS SELECT * from mat_view LIMIT 1;
DELETE FROM v_mat_view;

CREATE RULE "_RETURN" AS
    ON SELECT TO v_mat_view
    DO INSTEAD 
        SELECT * FROM mat_view_selector();
    -- this also converts the empty table 'v_mat_view' into a view.

The result is unsatisfying:

# explain analyze select field1 from v_mat_view where field2 = 44;
QUERY PLAN
Function Scan on mat_view_selector (cost=0.25..12.75 rows=5 width=4)
(actual time=15.457..18.048 rows=1 loops=1)
Filter: (field2 = 44)
Rows Removed by Filter: 20021
Total runtime: 31.753 ms

in comparison to selecting from the mat_view itself:

# explain analyze select field1 from mat_view where field2 = 44;
QUERY PLAN
Index Scan using mat_view_field2 on mat_view (cost=0.29..8.30 rows=1 width=4)
  (actual time=0.015..0.016 rows=1 loops=1)
Index Cond: (field2 = 44)
Total runtime: 0.036 ms

So essentially it DOES work, but performance might be an issue.

Anyone have better ideas? If not, then I would have to implement it somehow in the application logic or worse: run a simple cronjob that runs every minute or so.

Answer

klin picture klin · May 30, 2014

You should refresh the view in triggers after insert/update/delete/truncate for each statement on table1 and table2.

create or replace function refresh_mat_view()
returns trigger language plpgsql
as $$
begin
    refresh materialized view mat_view;
    return null;
end $$;

create trigger refresh_mat_view
after insert or update or delete or truncate
on table1 for each statement 
execute procedure refresh_mat_view();

create trigger refresh_mat_view
after insert or update or delete or truncate
on table2 for each statement 
execute procedure refresh_mat_view();

In this way your materialized view is always up to date. This simple solution might be hard to accept with frequent inserts/updates and sporadic selects. In your case (seldom changes about twice a day) it ideally fits your needs.


To realize deferred refresh of a materialized view you need one of the following features:

  • asynchronous trigger
  • trigger before select
  • rule on select before

Postgres has none of them, so it seems that there is no clear postgres solution.

Taking this into account I would consider a wrapper function for selects on mat_view, e.g.

CREATE OR REPLACE FUNCTION select_from_mat_view(where_clause text)
RETURNS SETOF mat_view AS $body$
BEGIN
  -- here is checking whether to refresh the mat_view
  -- then return the select:
  RETURN QUERY EXECUTE FORMAT ('SELECT * FROM mat_view %s', where_clause);
END;
$body$ LANGUAGE plpgsql;

If it is acceptable in practice depends on particulars I do not know about.