Avoiding Unnecessarily Running Triggers in Postgres

In Postgres (or any relational db), triggers can be useful:

1
create table books (
2
id serial primary key not null,
3
name text not null,
4
updated_at timestamptz default now()
5
);
6
7
create or replace function update_updated_at_column () returns trigger as $$
8
begin
9
new.updated_at = now();
10
return new;
11
end;
12
$$ language plpgsql;
13
14
create trigger update_books_updated_at before
15
update on books for each row
16
execute function update_updated_at_column ();
17
18
insert into books (name) values ('The Great Gatsby');
19
20
select name,updated_at from books;

nameupdated_at
The Great Gatsby2024-08-24 17:58:09.87828-04

1
update books
2
set name = 'THE GREAT GATSBY'
3
where id = 1
4
returning name,updated_at;

nameupdated_at
THE GREAT GATSBY2024-08-24 18:01:21.313217-04

But, they can also cause performance issues. To show this, we’re going to create a pyramid scheme.

Case Study: A Pyramid Scheme

  • Each person must sell $100 of goods
  • Each person will attempt to bring in other people.

We’ll define revenue as how much one individual has sold, and referred_revenue as how much anyone they’ve referred has sold. We’ll take it a step further though, and we’ll give you credit for anyone who’s been referred by someone you referred, and so on.

As the creator of the scheme, my revenue + referred_revenue is the total revenue of the company. Everybody is referred by me, or by somebody referred by me.

Let’s express our pyramid scheme in a schema:

1
create table members (
2
id serial primary key,
3
name text not null,
4
revenue numeric(10, 2) not null default 0,
5
referred_revenue numeric(10, 2) not null default 0,
6
7
referred_by integer references members (id)
8
);
9
10
create or replace function update_referred_revenue () returns trigger as $$
11
begin
12
-- if at top of pyramid, stop
13
if new.referred_by is null then
14
return new;
15
end if;
16
17
-- apply any change to the next member up the pyramid
18
update members
19
set referred_revenue =
20
referred_revenue
21
+ new.referred_revenue - coalesce(old.referred_revenue, 0)
22
+ new.revenue - coalesce(old.revenue, 0)
23
where id = new.referred_by;
24
25
return new;
26
end;
27
$$ language plpgsql;
28
29
create trigger update_referred_revenue after
30
insert or update on members for each row
31
execute function update_referred_revenue ();

Now, let’s see it in action:

1
insert into members (name, revenue) values ('me', 100);
2
insert into members (name, revenue, referred_by) values ('Stella', 100, 1);
3
insert into members (name, revenue, referred_by) values ('Cornelius', 100, 1);
4
insert into members (name, revenue, referred_by) values ('Rita', 100, 2);
5
insert into members (name, revenue, referred_by) values ('Conrad', 100, 2);
6
insert into members (name, revenue, referred_by) values ('Helmut', 100, 3);
7
insert into members (name, revenue, referred_by) values ('Manfried', 100, 3);

We can see referred_revenue cascading up the pyramid:

NameRevenueReferred Revenue
me100.00600.00
Stella100.00200.00
Cornelius100.00200.00
Rita100.000.00
Conrad100.000.00
Helmut100.000.00
Manfried100.000.00

This is a nice approach because it’s handling recursion for us.

The Problem

Now, let’s say Cornelius gets a million new recruits, but they haven’t sold anything yet.

1
insert into members (name, revenue, referred_by)
2
select 'member' || i::text, 0, 3
3
from generate_series(1, 1000000) as i;

That’s taking minutes to run… we can’t have that. Our pyramid scheme requires maximum efficiency!

Fortunately, we know they haven’t sold anything, so we don’t need to update anyone’s referred_revenue. We have 2 options.

Solution 1: Disable the Trigger

1
alter table members disable trigger update_referred_revenue;
2
3
insert into members (name, revenue, referred_by)
4
select 'member' || i::text, 0, 3
5
from generate_series(1, 1000000) as i;
6
7
alter table members enable trigger update_referred_revenue;

That runs in seconds. We could’ve also disabled all triggers on the table:

1
alter table members disable trigger all;

Solution 2: Conditionally Run the Trigger

To do this we’ll need to create two triggers, one for insert and one for update - the logic in each when condition will be different.

1
drop trigger update_referred_revenue on members;
2
3
create trigger update_referred_revenue_on_update
4
after update on members
5
for each row
6
when (
7
old.revenue != new.revenue
8
or old.referred_revenue != new.referred_revenue
9
) execute function update_referred_revenue ();
10
11
create trigger update_referred_revenue_on_insert
12
after insert on members
13
for each row
14
when (
15
new.revenue != 0
16
or new.referred_revenue != 0
17
) execute function update_referred_revenue ();

And now our new members can join the scheme.

1
insert into members (name, revenue, referred_by)
2
select 'member' || i::text, 0, 3
3
from generate_series(1, 1000000) as i;

Conclusion

Being so specific about when a trigger runs can seem unnecessary, but as the pyramid scheme example shows, triggers can basically block bulk operations.

In situations where a trigger needs to run a query, I’ve found them to be a main culprit of performance issues.

Wow! You read the whole thing. People who make it this far sometimes want to receive emails when I post something new.

I also have an RSS feed.