PostgreSQL’s transition relations.

Great news! PostgreSQL 10 is out. The release notes look amazing, and there are a lot of wonderful new features and additions. At the top of my personal list are the improvements to parallel queries, which were first introduced in 9.6. In the new Postgres, more operations can be executed in parallel, such as B-tree index scans, bitmap heap scans, merge joins, and non-correlated subqueries. Of course, the new Postgres has other great features too, such as table publication/subscription based on logical replication, and declarative partitioning.

Here, I would like to shed some light on another innovation related to trigger functions: transition relations. What does the official documentation say about transition relations? They are sets of rows that include all of the rows inserted, deleted, or modified by the current SQL statement. The key word here is all: with transition relations, a trigger gets a global view of the statement's effect, not just one row at a time. Note that transition relations can only be referenced from AFTER triggers, and they are most useful in statement-level (FOR EACH STATEMENT) triggers.

Below, I'm going to walk through a simple example of where they can be used.
Here are two tables: the first is an insert-only table with historical data; the second stores average aggregates computed from the historical data in the first table.

# CREATE TABLE trendbar (
    period INTEGER,
    create_date TIMESTAMP,
    open BIGINT,
    high BIGINT,
    low BIGINT,
    close BIGINT,
    volume BIGINT,
    symbol_id BIGINT
);
# CREATE TABLE trendbar_avg (
    create_date TIMESTAMP,
    avg_open NUMERIC(10,2),
    avg_high NUMERIC(10,2),
    avg_low NUMERIC(10,2),
    avg_close NUMERIC(10,2),
    sum_volume BIGINT,
    symbol_list BIGINT[]
);

Let’s imagine that our application populates the first table with the following query:

# INSERT INTO trendbar
SELECT
1 AS period,
now() AS create_date,
(random() * 10000)::bigint AS open,
(random() * 10000)::bigint AS high,
(random() * 1000)::bigint AS low,
(random() * 10000)::bigint AS close,
(random() * 100000)::bigint AS volume,
(random() * 9 + 1)::bigint AS symbol_id
FROM generate_series(1,10);

The number of inserted rows may vary, and after each insert operation we need to calculate some aggregates that are consumed by another application, a report, or something else.

Before PostgreSQL 10, the only way to “reach” the rows affected in the source table was a FOR EACH ROW trigger. Quite often, a series of changes in the source table affects just a single row in the aggregated table. In the FOR EACH ROW case, the database is forced to re-calculate the aggregated value and update that single row multiple times, once for every row touched in the source. This, unfortunately, causes the table (and its indexes) to bloat. Transition relations don’t have this disadvantage: they allow the trigger to process all the affected source rows in one pass.
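For contrast, here is a minimal sketch of the per-row approach (the function and trigger names here are hypothetical). Since all ten rows of our sample INSERT share the same create_date, every one of the ten trigger firings deletes and re-inserts the very same aggregate row, leaving dead tuples behind:

# CREATE OR REPLACE FUNCTION fn_tg_update_avg_row() RETURNS trigger AS
$$
BEGIN
    -- Re-aggregate the whole group for every single inserted row
    DELETE FROM trendbar_avg WHERE create_date = NEW.create_date;
    INSERT INTO trendbar_avg
    SELECT create_date, avg(open), avg(high), avg(low), avg(close),
           sum(volume), array_agg(symbol_id ORDER BY symbol_id)
    FROM trendbar
    WHERE create_date = NEW.create_date
    GROUP BY create_date;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

# CREATE TRIGGER tg_trendbar_update_averages_row AFTER INSERT ON trendbar
FOR EACH ROW EXECUTE PROCEDURE fn_tg_update_avg_row();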

This task is easily solved using an AFTER STATEMENT trigger with a transition table!

As usual, before adding the trigger, a trigger function has to be created. In this simple function we use a SELECT query that reads rows from the transition relation and generates the aggregated values. The transition table is named “inserted” here, but that isn’t a strict convention: you can pick any name you like in the trigger definition.

# CREATE OR REPLACE FUNCTION fn_tg_update_avg() RETURNS trigger AS
$$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO trendbar_avg
        SELECT create_date,
               avg(open) AS avg_open,
               avg(high) AS avg_high,
               avg(low) AS avg_low,
               avg(close) AS avg_close,
               sum(volume) AS sum_volume,
               array_agg(symbol_id ORDER BY symbol_id) AS symbol_list
        FROM inserted
        GROUP BY create_date;
    ELSE
        RAISE EXCEPTION 'Only INSERT operations allowed';
    END IF;
    -- The return value of an AFTER ... FOR EACH STATEMENT trigger is ignored
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

Once the function is created, the last step is to create the trigger. In the trigger declaration we must specify the REFERENCING clause and name the transition table. Here, I used the NEW TABLE keyword because I only insert rows. For UPDATE and DELETE operations, it’s possible to reference a transition table containing the old versions of the rows, as they were before the UPDATE or DELETE, via OLD TABLE. There is one thing to keep in mind: transition relations can’t be specified for triggers that cover more than one event, so it isn’t possible to handle INSERT, UPDATE, and DELETE commands with a single trigger.

# CREATE TRIGGER tg_trendbar_update_averages
AFTER INSERT ON trendbar
REFERENCING NEW TABLE AS inserted
FOR EACH STATEMENT
EXECUTE PROCEDURE fn_tg_update_avg();
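As an aside, for the UPDATE case mentioned above, a trigger can reference both row images at once. Here is a hedged sketch (the function and trigger names, and the relation names before_rows and after_rows, are hypothetical):

# CREATE OR REPLACE FUNCTION fn_tg_log_volume_change() RETURNS trigger AS
$$
BEGIN
    -- before_rows holds the pre-UPDATE images, after_rows the post-UPDATE ones
    RAISE NOTICE 'total volume changed by %',
        (SELECT coalesce(sum(volume), 0) FROM after_rows)
      - (SELECT coalesce(sum(volume), 0) FROM before_rows);
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

# CREATE TRIGGER tg_trendbar_log_volume_change
AFTER UPDATE ON trendbar
REFERENCING OLD TABLE AS before_rows NEW TABLE AS after_rows
FOR EACH STATEMENT
EXECUTE PROCEDURE fn_tg_log_volume_change();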

And that’s it. Now try inserting new data into the first table with the same INSERT query used above, and then check the second table.

# INSERT INTO trendbar
SELECT
1 AS period,
now() AS create_date,
(random() * 10000)::bigint AS open,
(random() * 10000)::bigint AS high,
(random() * 1000)::bigint AS low,
(random() * 10000)::bigint AS close,
(random() * 100000)::bigint AS volume,
(random() * 9 + 1)::bigint AS symbol_id
FROM generate_series(1,10);

# SELECT * FROM trendbar_avg ORDER BY create_date;

        create_date         | avg_open | avg_high | avg_low | avg_close | sum_volume |      symbol_list
----------------------------+----------+----------+---------+-----------+------------+------------------------
 2017-10-06 11:44:07.190285 |  4992.50 |  6660.60 |  530.90 |   5609.60 |     632176 | {1,3,4,4,5,6,7,8,8,9}
 2017-10-06 11:44:17.341953 |  4943.30 |  3435.70 |  654.50 |   4545.20 |     497750 | {3,3,4,5,5,6,7,8,9,9}
 2017-10-06 11:44:20.485938 |  4416.40 |  6137.10 |  508.80 |   5575.70 |     472368 | {3,6,7,7,8,8,9,9,9,10}

It works! Transition tables look neat, and they are useful whenever data needs to be post-processed after it has been modified.
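To double-check the result, you can recompute the same aggregates directly from the source table and compare them with the trigger-maintained values:

# SELECT create_date, avg(open)::numeric(10,2) AS avg_open, sum(volume) AS sum_volume
FROM trendbar
GROUP BY create_date
ORDER BY create_date;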

Another use case for this type of trigger is when you need to know the exact number of rows in a large table. Usually, we would create a FOR EACH ROW trigger and update a counter table, maintaining an exact value for count(*). With transition relations, Postgres lets you calculate the change to count(*) just once per statement, which reduces the load on the database compared to FOR EACH ROW triggers.
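Here is a minimal sketch of that idea (the counter table, function, and trigger names are hypothetical). Since transition relations can’t be attached to multi-event triggers, INSERT and DELETE each get their own trigger:

# CREATE TABLE trendbar_rowcount (n BIGINT NOT NULL);
# INSERT INTO trendbar_rowcount SELECT count(*) FROM trendbar;

# CREATE OR REPLACE FUNCTION fn_tg_maintain_rowcount() RETURNS trigger AS
$$
BEGIN
    IF TG_OP = 'INSERT' THEN
        -- new_rows holds every row inserted by the statement
        UPDATE trendbar_rowcount SET n = n + (SELECT count(*) FROM new_rows);
    ELSIF TG_OP = 'DELETE' THEN
        -- old_rows holds every row deleted by the statement
        UPDATE trendbar_rowcount SET n = n - (SELECT count(*) FROM old_rows);
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

# CREATE TRIGGER tg_trendbar_count_ins AFTER INSERT ON trendbar
REFERENCING NEW TABLE AS new_rows
FOR EACH STATEMENT EXECUTE PROCEDURE fn_tg_maintain_rowcount();

# CREATE TRIGGER tg_trendbar_count_del AFTER DELETE ON trendbar
REFERENCING OLD TABLE AS old_rows
FOR EACH STATEMENT EXECUTE PROCEDURE fn_tg_maintain_rowcount();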

If you have any other ideas please do share them in the comments.

And all in all, enjoy PostgreSQL 10!
