Sunday, April 30, 2006

MySQL hack emulates BEFORE statement-level triggers

MySQL 5 supports row-level triggers. This means you can have the server execute a SQL (or SQL/PSM) statement just before or after a row is inserted, updated or deleted. Because these triggers are fired in response to an operation occurring for a single row, these are called row-level triggers.

The SQL Standard also defines the syntax and semantics for triggers that are to be fired once for an entire statement, regardless of the number of rows that is touched by that statement. Unsurprisingly, these are called statement-level triggers.

MySQL does not support statement-level triggers. However, there's a quick hack to emulate BEFORE STATEMENT triggers.

Using the ROW_COUNT function


The hack depends on the ROW_COUNT() function. This function returns the number of rows updated, inserted, or deleted by the preceding statement. An example update statement for the sakila database might illustrate that:

UPDATE sakila.category
SET name = UCASE(name)
;

Query OK, 16 rows affected (0.22 sec)
Rows matched: 16 Changed: 16 Warnings: 0

SELECT row_count();
+-------------+
| row_count() |
+-------------+
| 16 |
+-------------+
1 row in set (0.04 sec)

(BTW, you might want to backup your sakila installation and restore it or find some other way to undo the changes performed by the UPDATE statement.)

So, the update statement updated all 16 rows in the category table.

Using ROW_COUNT inside TRIGGERs


I wondered how ROW_COUNT() would react if it would be called from within a conventional MySQL row-level trigger, so I decided to test it.

First, we'll set up a table for which we want to define the triggers:

CREATE TABLE test_row_count(
id int
);


To log trigger activity for operations occurring on this table, we'll need another table:

CREATE TABLE test_row_count_log(
id int AUTO_INCREMENT PRIMARY KEY
, test_row_count_id int
, trigger_event enum('DELETE','INSERT','UPDATE')
, trigger_time enum('BEFORE','AFTER')
, row_count int
);


Now we can create triggers for each of the three DML events insert, update and delete, and for each of these events, we can write both a before and a after tigger. So, potentially, we could write up to 3 * 2 = 6 different triggers. Let's start out with the two INSERT triggers and see what happens:

delimiter //

CREATE TRIGGER bir_test_row_count
BEFORE INSERT ON test_row_count
FOR EACH ROW
INSERT
INTO test_row_count_log(
test_row_count_id
, trigger_event
, trigger_time
, row_count
) VALUES (
new.id
, 'INSERT'
, 'BEFORE'
, row_count()
);
//

CREATE TRIGGER air_test_row_count
AFTER INSERT ON test_row_count
FOR EACH ROW
INSERT
INTO test_row_count_log(
test_row_count_id
, trigger_event
, trigger_time
, row_count
) VALUES (
new.id
, 'INSERT'
, 'AFTER'
, row_count()
);
//

delimiter ;

Now let's see what happens when we insert a row into the test_row_count table.:

INSERT
INTO test_row_count(
id
) VALUES (
1
)
;

Executing this INSERT will fire both the triggers exactly once, so we expect to see two rows inside the test_row_count_log table.

We expect to see one row in test_row_count_log corresponding to the instant immediately before a row was created in test_row_count; we also expect one row corresponding to the moment immediately after the insert on test_row_count occurred. We're most interested in the value stored in the row_count column of the test_row_count_log table, as this would've captured the value returned by the ROW_COUNT() function. So:

SELECT *
FROM test_row_count_log
ORDER BY id
;

Here's what happened:

+----+-------------------+---------------+--------------+-----------+
| id | test_row_count_id | trigger_event | trigger_time | row_count |
+----+-------------------+---------------+--------------+-----------+
| 1 | 1 | INSERT | BEFORE | -1 |
| 2 | 1 | INSERT | AFTER | 1 |
+----+-------------------+---------------+--------------+-----------+
2 rows in set (0.08 sec)

So, both triggers fired once, and both inserted a row in the test_row_count_log. Actually, it's quite funny seeing the -1 value in the row_count column of the row corresponding to the firing of the BEFORE trigger. I mean, being a count, I'd expect the value 0 rather than -1.

For the row corresponding to the AFTER trigger, the value in the row_count value makes more sense. At this point, we can certainly agree that indeed exactly one row is inserted, and so the value is 1 in this case.

When we execute a statement that inserts multiple rows, we can observe another interesting phenomenon:

INSERT
INTO test_row_count
VALUES (2),(3)
;

SELECT *
FROM test_row_count_log
WHERE test_row_count_id in (2,3)
ORDER BY id
;

+----+-------------------+---------------+--------------+-----------+
| id | test_row_count_id | trigger_event | trigger_time | row_count |
+----+-------------------+---------------+--------------+-----------+
| 3 | 2 | INSERT | BEFORE | -1 |
| 4 | 2 | INSERT | AFTER | 1 |
| 5 | 3 | INSERT | BEFORE | 1 |
| 6 | 3 | INSERT | AFTER | 1 |
+----+-------------------+---------------+--------------+-----------+
4 rows in set (0.02 sec)

Even though the statement fires the BEFORE trigger multiple times, the -1 is returned only for the first row that is touched by the statement. For all subsequent rows, ROW_COUNT() returns a 1 for de row_count column.

Emulating the statement level trigger


This behaviour offers the opportunity to emulate a statement-level trigger. Let's add a level column to the test_row_count_log table and rewrite the BEFORE trigger to demonstrate this:

ALTER TABLE test_row_count_log
ADD level enum('ROW','STATEMENT')
DEFAULT 'ROW'
;
DROP TRIGGER bir_test_row_count
;

delimiter //

CREATE TRIGGER bir_test_row_count
BEFORE INSERT ON test_row_count
FOR EACH ROW
BEGIN
DECLARE v_row_count int DEFAULT row_count();
IF v_row_count!=1 THEN
INSERT
INTO test_row_count_log(
test_row_count_id
, trigger_event
, trigger_time
, row_count
, level
) VALUES (
new.id
, 'INSERT'
, 'BEFORE'
, v_row_count
, 'STATEMENT'
);
END IF;
INSERT
INTO test_row_count_log(
test_row_count_id
, trigger_event
, trigger_time
, row_count
, level
) VALUES (
new.id
, 'INSERT'
, 'BEFORE'
, v_row_count
, 'ROW'
);
END;
//

delimiter ;



INSERT
INTO test_row_count (id)
SELECT id + 3
FROM test_row_count
;

SELECT *
FROM test_row_count_log
WHERE test_row_count_id > 3
ORDER BY id
;

+----+-------------------+---------------+--------------+-----------+-----------+
| id | test_row_count_id | trigger_event | trigger_time | row_count | level |
+----+-------------------+---------------+--------------+-----------+-----------+
| 7 | 4 | INSERT | BEFORE | -1 | STATEMENT |
| 8 | 4 | INSERT | BEFORE | -1 | ROW |
| 9 | 4 | INSERT | AFTER | 1 | ROW |
| 10 | 5 | INSERT | BEFORE | 1 | ROW |
| 11 | 5 | INSERT | AFTER | 1 | ROW |
| 12 | 6 | INSERT | BEFORE | 1 | ROW |
| 13 | 6 | INSERT | AFTER | 1 | ROW |
+----+-------------------+---------------+--------------+-----------+-----------+

Using an IF statement to test the value returned by the ROW_COUNT() value, we can detect if the trigger is handling the first row, and if it is, we can do some special work. In this case, we simply insert a row in the test_row_count_log table, marking it by storing the value 'STATEMENT' in the level column.

Some closing notes


The IF statement tests for those cases where the ROW_COUNT() function returns something else than 1, rather than checking explicitly for equality with -1. This is because I suspect that the value of -1 might be a bug, and it wouldn't surprise me if that would be changed to be a 0 in a newer version of the server.

There's another observation that might convince you to be careful with this hack. At least in MySQL versions 5.0.18 and 5.1.7 (the platform I tested), ROW_COUNT() behaves differently when you wrap a procedure around the code that's now inside the triggers. I found that ROW_COUNT() always returns 0 in this case. As far as I can see now, at least three distinct contexts are involved: procedures, triggers and immediate statements. There might be even more of course, but my observations with ROW_COUNT() imply at least these three.

If you plan to use ROW_COUNT() to emulate a statement-level BEFORE trigger, it's probably a good idea to be careful when upgrading, and convince yourself that the code still works in the newer server version.

I also tested this hack for UPDATE and DELETE statements - that worked for me. Of course, if anyone has any new insights, or methods to achieve a statement-level trigger in MySQL, I'd be very interested. Just leave a comment here and tell the world about it.

One of the things that keep intriguing me if it would be possible to emulate an AFTER statement-level trigger. As far as I can see now, this is impossible, as there is now information regarding the total number of rows that is handled by the statement that fires the triggers. Of course, If you know a hack that does this, I'd be most grateful to hear about it.

2 comments:

Anonymous said...

From MySQL Reference Manual

" row_count() === C API mysql_affected_rows()

PROTO
my_ulonglong mysql_affected_rows(MYSQL *mysql)
Return Values

An integer greater than zero indicates the number of rows affected or
retrieved.
Zero indicates that no records were updated for an UPDATE statement,
no rows matched the WHERE clause in the query or that no query has yet
been executed.
(-1) indicates that the query returned an error or that, for a
SELECT query, mysql_affected_rows() was called prior to calling
mysql_store_result()
. Because mysql_affected_rows() returns an
unsigned value, you can check for -1 by comparing the return value to
(my_ulonglong)-1 (or to (my_ulonglong)~0, which is equivalent "

-> THIS includes triggers before INSERT/UPDATE/REPLACE/DELETE statement!!!

Example:
shell>> mysql --user=root --password test
-- log-in OK+
mysql>> select row_count();
-- returns 0
mysql>> show tables;
OR
mysql>> select * from foo;
Query OK,.................
mysql>> select row_count();
-- returns -1 AND this is right!
-- the same for before triggers

Best regards: Nick,
web: myeraser.fortunecity.com

Unknown said...

No sir.
row_count() is returning -1 in the first line, that's why is inserting STATEMENT,
make it v_row_count > 1 instead of v_row_count !=1 , then see the results yourself.
That time it will never insert STATEMENT in your table.

DuckDB bag of tricks: Processing PGN chess games with DuckDB - Rolling up each game's lines into a single game row (6/6)

DuckDB bag of tricks is the banner I use on this blog to post my tips and tricks about DuckDB . This post is the sixth installment of a s...