Cool Stuff in PostgreSQL 10: Transition Table Triggers

Last time, we created a hierarchy of tables for logging, but we don't have anything for them to log, and we don't have a way to ensure the logging happens automagically.  This time, we'll fix both problems.

First, a thing to log:

CREATE TABLE foo(
    id BIGSERIAL PRIMARY KEY,
    t TEXT
);

Next, let's make an efficient trigger function.  You may have seen one of these before, but it was probably row-based, so it couldn't be especially efficient.  This one is statement-based, and uses the new transition table feature in PostgreSQL 10.  We only need to build one trigger function because our logging infrastructure is schema-agnostic.  We could have separated it into three trigger functions, but that's just more code to maintain.  It starts off as usual:

CREATE OR REPLACE FUNCTION log()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $$
BEGIN
    IF TG_OP NOT IN ('INSERT', 'UPDATE', 'DELETE') THEN
        RAISE EXCEPTION 'This function should not fire on %', TG_OP;
    END IF;

    IF TG_OP = 'INSERT' THEN
        INSERT INTO the_log (
            action, table_schema,    table_name, new_row
        )
        SELECT
            TG_OP,  TG_TABLE_SCHEMA, TG_TABLE_NAME, row_to_json(new_table)::jsonb
        FROM
            new_table;

While the rest of it should be familiar, new_table is new.  It's a transition table, which contains the rows as they looked after the operation, which here is an INSERT.  We picked an arbitrary name for it; we'll see below how that name gets declared.  One operation done, two to go.  The next one looks pretty similar:

    ELSIF TG_OP = 'DELETE' THEN
        INSERT INTO the_log (
            action, table_schema,    table_name, old_row
        )
        SELECT
            TG_OP,  TG_TABLE_SCHEMA, TG_TABLE_NAME, row_to_json(old_table)::jsonb
        FROM
            old_table;

The other transition table, old_table, comes into play here.  For UPDATE, we'll use both:

    ELSE
        /*
         *  DANGER, WILL ROBINSON!  DANGER!
         *  This implementation assumes based on current implementation details
         *  that old_table and new_table have identical orderings.  Should that
         *  implementation detail change, this could get a lot more complicated.
         */

        WITH
            o AS (
                SELECT
                    row_to_json(old_table)::jsonb AS old_row,
                    row_number() OVER () AS ord
                FROM old_table
            ),
            n AS (
                SELECT
                    row_to_json(new_table)::jsonb AS new_row,
                    row_number() OVER () AS ord
                FROM new_table
            )
        INSERT INTO the_log (
            action, table_schema,    table_name, old_row, new_row
        )
        SELECT
            TG_OP,  TG_TABLE_SCHEMA, TG_TABLE_NAME, old_row, new_row
        FROM
            o
        JOIN
            n
            USING(ord);
    END IF;
    RETURN NULL;
END;
$$;

This may look a little intimidating, but it's just a way to make sure we've got corresponding rows without mandating primary keys that are never updated.
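If depending on that ordering makes you nervous, and every audited table carries an immutable primary key, you can pair rows through the JSON instead.  This is only a sketch: it assumes the key column is named id, which the generic log() function can't know in advance, so it trades away exactly the schema-agnosticism the row_number() version preserves.

```sql
-- Sketch: pair old and new rows by primary key rather than by ordering.
-- Assumes every audited table has an immutable "id" column, which is
-- precisely the requirement the row_number() approach avoids.
WITH
    o AS (SELECT row_to_json(old_table)::jsonb AS old_row FROM old_table),
    n AS (SELECT row_to_json(new_table)::jsonb AS new_row FROM new_table)
INSERT INTO the_log (action, table_schema, table_name, old_row, new_row)
SELECT TG_OP, TG_TABLE_SCHEMA, TG_TABLE_NAME, old_row, new_row
FROM o
JOIN n ON o.old_row->>'id' = n.new_row->>'id';
```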

Next, triggers that call the function.  Each operation exposes a different combination of transition tables -- INSERT has only new_table, DELETE has only old_table, and UPDATE has both -- so we'll need three of them:

CREATE TRIGGER log_insert_public_foo
    AFTER INSERT ON public.foo
    REFERENCING NEW TABLE AS new_table
    FOR EACH STATEMENT
        EXECUTE PROCEDURE public.log();

CREATE TRIGGER log_update_public_foo
    AFTER UPDATE ON public.foo
    REFERENCING OLD TABLE AS old_table NEW TABLE AS new_table
    FOR EACH STATEMENT
        EXECUTE PROCEDURE public.log();

CREATE TRIGGER log_delete_public_foo
    AFTER DELETE ON public.foo
    REFERENCING OLD TABLE AS old_table
    FOR EACH STATEMENT
        EXECUTE PROCEDURE public.log();

Now, we can INSERT (or COPY), UPDATE, and DELETE from the foo table and know we're capturing the change stream.
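A quick smoke test, assuming the the_log table from the previous post is in place:

```sql
-- Fire each trigger once, then inspect the captured change stream.
INSERT INTO foo (t) VALUES ('one'), ('two');
UPDATE foo SET t = upper(t) WHERE t = 'two';
DELETE FROM foo WHERE t = 'one';

-- One INSERT log row per inserted tuple, one UPDATE row, one DELETE row.
SELECT action, old_row, new_row FROM the_log;
```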

One little problem, addressed in our third and final episode of this series, is that there was an awful lot of boilerplate code.  We'll DRY it up.

Comments

  1. As far as you know, this feature is still in for PG10? Last I saw there was a little bit of an argument going on with it regarding bugs which have come up, and patches not being reviewed in a timely manner. I am really looking forward to this feature going in, so I have my fingers crossed that those issues can be solved.

    1. It's going in as far as I can tell. At this point, something catastrophic would have to occur to remove it for 10.

  2. Set oriented triggers at last. Wish I could use RETURNING * INTO (new|old)_table in regular SQL statements without involving triggers. This would be a game changer in functions. Today you can do it inside the same SQL statement (using WITH), but not outside it, within the same function scope and transaction.

    1. You can do NEW (essentially) with RETURNING. RETURNING OLD is now a relatively small matter of tuits (the round kind) and could easily be in 11.

  3. The transition table seems to be like OLD and NEW tables in Sybase/SQLServer, which contains the before and after image of the rows updated by the current SQL and are available for application.
    I used it 12 yrs ago in Sybase and always found it cool. I don't think Oracle or DB2 offers something similar.

  5. Nice post thanks! Useful feature.

  6. Very useful! Currently we have a Cache invalidation solution with Row-Triggers (and pg_notify).
    We hope/think this new feature will improve the performance.

    1. I hope a lot of people create nice things with it. It's been a long time coming.


