Roland Bouman's blog: mysql stored routine


Thursday, December 01, 2011

Common Schema: dependencies routines

Are you a MySQL DBA? Check out the common_schema project by Oracle ACE Shlomi Noach.

The common_schema is an open source MySQL schema that packs a number of utility views, functions and stored procedures. You can use these utilities to simplify MySQL database administration and development. Shlomi just released revision 178, and I'm happy and proud to be working together with Shlomi on this project.

Among the many cool features created by Shlomi, such as foreach, repeat_exec and exec_file, there are a few %_dependencies procedures I contributed:

get_event_dependencies(schema_name, event_name)

get_routine_dependencies(schema_name, routine_name)

get_sql_dependencies(sql, default_schema)

get_view_dependencies(schema_name, view_name)

All these procedures return a result set that indicates which schema objects are used by the object identified by the input parameters. Here are a few examples that should give you an idea:


mysql> call common_schema.get_routine_dependencies('common_schema', 'get_routine_dependencies');
+---------------+----------------------+-------------+--------+
| schema_name   | object_name          | object_type | action |
+---------------+----------------------+-------------+--------+
| common_schema | get_sql_dependencies | procedure   | call   |
| mysql         | proc                 | table       | select |
+---------------+----------------------+-------------+--------+
2 rows in set (0.19 sec)

Query OK, 0 rows affected (0.19 sec)

mysql> call common_schema.get_routine_dependencies('common_schema', 'get_sql_dependencies');
+---------------+-------------------+-------------+--------+
| schema_name   | object_name       | object_type | action |
+---------------+-------------------+-------------+--------+
| common_schema | _get_sql_token    | procedure   | call   |
| common_schema | _sql_dependencies | table       | create |
| common_schema | _sql_dependencies | table       | drop   |
| common_schema | _sql_dependencies | table       | insert |
| common_schema | _sql_dependencies | table       | select |
+---------------+-------------------+-------------+--------+
5 rows in set (1.59 sec)

Of course, there is still a lot left to be desired. The main shortcoming, as I see it now, is that the dependencies are listed only one level deep: that is, they are not analyzed recursively. Another problem is that there is currently nothing to calculate reverse dependencies (which would arguably be more useful).
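To make these two missing features concrete, here is a minimal Python sketch (not part of common_schema) of how one-level dependency rows like the ones shown above could be expanded recursively and inverted. The DIRECT mapping is hand-copied from the sample output, with schema names and actions dropped for brevity, and both function names are hypothetical.

```python
from collections import defaultdict

# One-level dependencies, hand-copied from the sample output above
# (object names only; schema names and actions omitted for brevity).
DIRECT = {
    "get_routine_dependencies": {"get_sql_dependencies", "proc"},
    "get_sql_dependencies": {"_get_sql_token", "_sql_dependencies"},
}


def recursive_dependencies(name, direct=DIRECT):
    """Return everything `name` depends on, directly or indirectly."""
    seen = set()
    stack = [name]
    while stack:
        current = stack.pop()
        for dep in direct.get(current, ()):
            if dep not in seen:  # skipping known objects also guards against cycles
                seen.add(dep)
                stack.append(dep)
    return seen


def reverse_dependencies(direct=DIRECT):
    """Invert the mapping: object -> set of objects that use it directly."""
    used_by = defaultdict(set)
    for user, deps in direct.items():
        for dep in deps:
            used_by[dep].add(user)
    return dict(used_by)
```

In a real implementation the one-level rows would presumably come from repeated calls to the %_dependencies procedures rather than from a hard-coded dictionary.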

The good news is, this is all open source, and your contributions are welcome! If you're interested in the source code of these routines, check out the common_schema project, and look in the common_schema/routines/dependencies directory.

If you'd like to add recursive dependencies, or reverse dependencies, then don't hesitate to contribute. If you have a one-off contribution that relates directly to these dependencies routines, then it's probably easiest to email me directly, and I'll see what I can do to get it in. If you are interested in a longer-term contribution, it's probably best to write to Shlomi, as he is the owner of the common_schema project.

You can even contribute without implementing new features or fixing bugs, simply by using the software and reporting bugs or offering suggestions to improve it. If you find a bug, or have an idea for an improvement or an entirely new feature, please use the issue tracker.

For now, enjoy, and until next time.

Wednesday, August 24, 2011

Re-implementing udf_init_error in MySQL 5.5 and up

To whom it may concern -

Today, I received an email from a user of the udf_init_error UDF (which resides in the lib_mysqludf_udf library). The purpose of this UDF is to generate an error condition, which can be used to abruptly terminate a trigger or stored procedure. As such it is a workaround for bug #11661. This is all described extensively in my now ancient article here.

The user wrote me because of a problem experienced in MySQL 5.5:

...calling
select udf_init_error('Transaction Cannot Be Done Because....');
will return user friendly error message:
Transaction Cannot Be Done Because....
But in MySQL 5.5, it returns
Can't initialize function 'udf_init_error; Transaction Cannot Be Done Because....
The Can't initialize function 'udf_init_error; bit is so annoying! How can I get rid of that?

I explained that the UDF still works like it should; it's just that at some point during the 5.0 lifecycle, the format of the error message was changed. (I can't recall exactly which version that was, but I did file bug #38452 that describes this issue).

Anyway, I suggested moving away from the udf_init_error() UDF, and porting all dependent code to the SIGNAL syntax instead, which was introduced in MySQL 5.5. (For a friendly introduction to using the SIGNAL syntax, please check out one of my prior articles).

Unfortunately, for this particular user this would not be an easy task:

The use of SIGNAL did come to my mind, but the implementation is not easy. I have thousands of stored routines to modify. Besides, I'm already satisfied with what the UDF does.

On the one hand, it makes me happy to hear the udf_init_error() UDF served him so well that he wrote so many routines that rely on it; on the other hand, I feel bad that this is holding him back from upgrading to MySQL 5.5.

For everybody that is in this same position, I'd like to suggest the following solution: simply re-implement udf_init_error() as a stored SQL function that uses the SIGNAL functionality instead. The error message returned to the client will not be exactly the same as in the olden MySQL 5.0 days, but at least there will not be an annoying complaint about a UDF that cannot be initialized.

Here's a very simple example that illustrates how to do it:

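-- In the mysql command-line client, switch the statement delimiter first,
-- so the semicolons inside the function body do not end the CREATE statement.
DELIMITER //

CREATE FUNCTION udf_init_error(
   p_message VARCHAR(80)
)
RETURNS INTEGER
DETERMINISTIC
NO SQL
BEGIN
   -- '45000' is the generic SQLSTATE for user-defined exceptions
   DECLARE err CONDITION FOR SQLSTATE '45000';
   SIGNAL err SET MESSAGE_TEXT = p_message;
   -- Never reached, but a stored function must contain a RETURN statement
   RETURN 1;
END//

DELIMITER ;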

I hope this helps.

Wednesday, April 14, 2010

MySQL Conference 2010 Presentation: Optimizing Stored Routines

Yesterday I delivered my presentation for the MySQL User Conference and Expo 2010: Optimizing MySQL Stored Routines. If you are interested in the slides, you can find them on both the MySQL conference site and slideshare.net. Here's the abstract of my presentation so you can decide if this is interesting for you:

MySQL stored routines (functions, procedures, triggers and events) can be useful. But many casually written stored routines are unnecessarily slow. The main reason is that MySQL does not apply even simple code optimizations to stored routine code. Many developers are not aware of this, and as a result, write stored routine code that can quite easily be tuned, increasing performance by 50%-100% by only applying very straightforward code optimizations.

I was very pleased to see so many people attend: I had the impression that MySQL stored routines are quite unpopular, due to performance issues and a syntax that is often regarded as "clunky", so I didn't expect more than about 20 people to show up. Much to my pleasure, the ballroom was about two-thirds full, and I estimate there were 70-something people in the room.

A quick survey of the audience indicated that all of them were in fact using stored routines in production, so I assume they didn't show up out of morbid curiosity :) Interestingly, only a few people reported performance issues. It would be interesting to do more research to find out what people are in fact doing with MySQL stored routines. Among yesterday's attendees, there were people using MySQL stored routines for managing user privileges, processing astronomical data, and checking complex dynamic business rules. To be sure, these were all different users, not just one isolated fanatic going wild with stored routines.

Coincidentally, Domas Mituzas from Facebook also mentioned stored routines in his presentation on high concurrency MySQL as a way to reduce the lock gap when performing multiple changes in a single transaction. I'm just saying: perhaps MySQL stored routines aren't that bad at all; they just need more love and dedication from the MySQL developers so they can mature and gain wider applicability.

I recently wrote about an improvement in MySQL 5.5, the long-anticipated SIGNAL/RESIGNAL syntax. I hope more improvements will follow soon, now that the dust is settling after Oracle's acquisition of Sun. After hearing Edward Screven unfold Oracle's strategy for MySQL in yesterday's keynote, I can tell you without reservation that I am quite optimistic :)

Anyway - that is all for now. Two days of conference ahead :)

Wednesday, February 17, 2010

MySQL - the best stored routine is the one you don't write

At FOSDEM 2010, already two weeks ago, I had the pleasure of hearing Geert van der Kelen explain the work he has been doing on connecting MySQL and Python. I don't know anything about Python, but anybody that has the courage, perseverance and coding skills to create an implementation of the MySQL wire protocol from scratch is a class-A programmer in my book. So, I encourage everyone that needs MySQL connectivity for Python programs to check out Geert's brainchild, MySQL Connector/Python.

In relation to MySQL Connector/Python, I just read a post from Geert about how he uses the MySQL information_schema to generate some Python code. In this particular case, he needs the data from the COLLATIONS table to maintain a data structure that describes all collations supported by MySQL.

For some reason that I cannot fathom, Geert needed to generate a structure for each possible collation, not just the ones for which the COLLATIONS table contains a row. To do this, he wrote a stored procedure that uses a cursor to loop through the COLLATIONS table. In the loop, he detects whenever there's a gap in the sequence of values from the ID column, and then starts a new loop to "fill the gaps". For each iteration of the outer cursor loop, a piece of text is emitted that conforms to the syntax of a Python tuple describing the collation, and each iteration of the inner loop generates the text None, a Python built-in constant.
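The gap-filling logic is easier to follow in code. The following is a rough Python sketch of the loop structure described above, not Geert's actual procedure: the function name, the start parameter and the sample rows are made up for illustration, and the rows are assumed to arrive ordered by ID.

```python
def tuples_with_gaps(rows, start=0):
    """Emit one text line per ID, writing 'None,' for every missing ID.

    `rows` is an iterable of (id, charset, collation) ordered by id,
    standing in for the cursor over the COLLATIONS table.
    `start` is the first ID to cover (hypothetical convenience parameter).
    """
    lines = []
    expected = start                              # next ID we expect to see
    for coll_id, charset, collation in rows:      # outer "cursor" loop
        while expected < coll_id:                 # inner loop fills the gap
            lines.append("None,")
            expected += 1
        lines.append('("%s","%s"), # %d' % (charset, collation, coll_id))
        expected = coll_id + 1
    return lines


# Illustrative sample: ID 17 is deliberately missing.
sample = [
    (16, "hebrew", "hebrew_general_ci"),
    (18, "tis620", "tis620_thai_ci"),
]
```

Calling tuples_with_gaps(sample, start=16) yields a tuple line for 16, a None, line for the missing 17, and a tuple line for 18.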

The final result of the procedure is a snippet of Python code shown below (abbreviated):


..
("cp1251","cp1251_bulgarian_ci"), # 14
("latin1","latin1_danish_ci"), # 15
("hebrew","hebrew_general_ci"), # 16
None,
("tis620","tis620_thai_ci"), # 18
("euckr","euckr_korean_ci"), # 19
..

In the final code, these lines are themselves used to form yet another tuple:


desc = (
    None,
    ("big5","big5_chinese_ci"), # 1
    ("latin2","latin2_czech_cs"), # 2
    ("dec8","dec8_swedish_ci"), # 3
    ("cp850","cp850_general_ci"), # 4
..

This is an excellent use of the information schema! However, I am not too thrilled about using a stored routine for this. Enter my FOSDEM talk about refactoring stored routines.

In this case, performance is not really an issue, so I won't play that card. But many people that do need well-performing stored procedures might start out like Geert and write a cursor loop, and perhaps do some looping inside that loop. One of the big take-aways in my presentation is to become aware of the ways that you can avoid a stored procedure. Geert's procedure is an excellent candidate to illustrate the point. As a bonus, I'm adding the code that is necessary to generate the entire snippet, not just the collection of tuples inside the outer pair of parentheses.

So, here goes:


set group_concat_max_len := @@max_allowed_packet;

select      concat('desc = (',
                group_concat('\n   '
                ,   if( collations.id is null, 'None',   
                        concat('(', '"', character_set_name, '"',
                            ',', '"', collation_name, '"', ')')
                    )
                ,   if(ids.id=255, '', ','), ' #', ids.id
                order by ids.id
                separator ''    
                ), '\n)'
            )
from       (select (t0.id<<0) + (t1.id<<1) + (t2.id<<2)
            +      (t3.id<<3) + (t4.id<<4) + (t5.id<<5)
            +      (t6.id<<6) + (t7.id<<7)            id
            from   (select 0 id union all select 1) t0
            ,      (select 0 id union all select 1) t1
            ,      (select 0 id union all select 1) t2
            ,      (select 0 id union all select 1) t3
            ,      (select 0 id union all select 1) t4
            ,      (select 0 id union all select 1) t5
            ,      (select 0 id union all select 1) t6
            ,      (select 0 id union all select 1) t7) ids
left join   information_schema.collations on ids.id = collations.id;

This query works by first generating 256 rows having ids ranging from 0 to 255. (I think I recall Alexander Barkov mentioning that this is currently the maximum number of collations that MySQL supports; perhaps I am wrong there). This is done by cross-joining a simple derived table that generates two rows:


(select 0 id union all select 1)

So, one row yields 0, and one yields 1. By cross-joining 8 of these derived tables, we get 2 to the 8th power rows, which equals 256. In the SELECT-list, I use the left bitshift operator << to shift the original 0 and 1 by 0, 1, 2 and so on, up to 7 positions. By then adding those values together, we fill up exactly one byte, and obtain all possible values from 0 through 255:


(select (t0.id<<0) + (t1.id<<1) + (t2.id<<2)
 +      (t3.id<<3) + (t4.id<<4) + (t5.id<<5)
 +      (t6.id<<6) + (t7.id<<7)            id
 from   (select 0 id union all select 1) t0
 ,      ...                              t1
 ,      ...
 ,      (select 0 id union all select 1) t7) ids
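For readers who find the arithmetic easier to check outside SQL, here is a small Python sketch of the same idea. It mirrors the derived table rather than replacing it: itertools.product plays the role of the cross join, and the function name is hypothetical.

```python
from itertools import product


def ids_from_bits(width=8):
    """Cross-join `width` two-row tables (0, 1) and combine them by bit shifting."""
    ids = []
    for bits in product((0, 1), repeat=width):    # 2**width combinations
        # bits[n] corresponds to tN.id in the query; shift it n positions left
        ids.append(sum(bit << n for n, bit in enumerate(bits)))
    return ids
```

Every combination of bits yields a distinct number, so the result contains each value from 0 through 255 exactly once. The order in which the combinations are produced does not matter, just as the query leaves the ordering to the ORDER BY inside GROUP_CONCAT.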

Once we have this, the rest is straightforward - all we have to do now is use a LEFT JOIN to find any collations from the information_schema.COLLATIONS table in case the value of its ID column matches the value we computed with the bit-shifting jiggery-pokery. For the matching rows, we use CONCAT to generate a Python tuple describing the collation, and for the non-matching rows, we generate None:


if( collations.id is null, 'None',   
    concat('(', '"', character_set_name, '"',
           ',', '"', collation_name, '"', ')')
)

The final touch is a GROUP_CONCAT that we use to bunch these up into a comma-separated list that is used as entries for the outer tuple. As always, you should set the value of the group_concat_max_len server variable to a value high enough to hold the generated string. If you want to be on the safe side and not run the risk of getting a truncated result, use the value of max_allowed_packet.
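To check the shape of the generated text, the LEFT JOIN and GROUP_CONCAT steps can be imitated in a few lines of Python. This is only a sketch under assumptions: the collations dictionary stands in for information_schema.COLLATIONS, and the function name and the max_id parameter are hypothetical.

```python
def render_desc(collations, max_id=255):
    """Build the 'desc = (...)' text the way the query does.

    `collations` maps id -> (character_set_name, collation_name);
    ids without an entry are rendered as None.
    """
    entries = []
    for i in range(max_id + 1):                   # the generated ids, in order
        row = collations.get(i)                   # LEFT JOIN: may find nothing
        if row is None:
            text = "None"
        else:
            text = '("%s","%s")' % row
        comma = "" if i == max_id else ","        # no comma after the last entry
        entries.append("\n   %s%s #%d" % (text, comma, i))
    # GROUP_CONCAT with an empty separator, wrapped by the outer CONCAT
    return "desc = (" + "".join(entries) + "\n)"
```

For example, render_desc({1: ("big5", "big5_chinese_ci")}, max_id=2) produces a three-entry tuple definition in which entries 0 and 2 are None. The result is itself valid Python source, which is the point of the exercise.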

I have the honour of speaking at the MySQL User Conference, April 12-15 later this year. There, I will be doing a related talk called Optimizing MySQL Stored Routines. In this talk, I will explain how stored routines impact performance, and provide some tips on how you can avoid them, but also on how to improve your stored procedure code in case you really do need them.