Locations of visitors to this page Book: "Pentaho Kettle Solutions", Matt Casters, Roland Bouman, & Jos van Dongen, Wiley 2010 Book: "Pentaho Solutions", Roland Bouman & Jos van Dongen, Wiley 2009

Friday, February 01, 2008

MySQL Information Schema Plugins: the best kept secret of MySQL 5.1

MySQL 5.1 offers an extremely useful feature called information schema plug-ins. This feature allows dynamic runtime loading of a shared library into the MySQL server to implement a table in the information_schema database. The SQL standard (ISO/IEC 9075-11:2003) allows database implementations to extend the information_schema. MySQL 5.1 transfers the possibility to do this directly to privileged database users so they can extend the information_schema themselves, in any way they see fit.

In this article, we will demonstrate how to create a minimal "Hello, World!" MySQL information schema plugin. In a forthcoming article, we'll demonstrate how information schema plugins may be used to report some of the server's internals such as the contents of the query cache, session level objects such as the currently defined TEMPORARY tables, user-defined variables and SAVEPOINTs.

The MySQL Plug-in API


Information Schema plug-ins are a subfeature of the MySQL plug-in API.
The plug-in API is one of the new features in the upcoming release of the MySQL database server, MySQL 5.1 (which is currently a release candidate). In essence, the MySQL plugin API provides a generic extension point to the MySQL server. It allows users to load a shared library in order to add new functionality to the server.

Plug-ins can be loaded and unloaded using the MySQL specific INSTALL PLUGIN and UNINSTALL PLUGIN syntax respectively. A key feature is that this process is completely dynamic - the server need not be re-compiled and need not be stopped in order to benefit from the functionality of a new plugin. Hence, new functionality can be added without suffering any downtime.

In some respects, the new plugin feature resembles the since long supported user-defined function (UDF) feature. Both plugins and UDFs involve dynamically loading a shared library to extend the server's functionality. Like UDFs, plug-ins are usually written in C/C++. The difference between UDFs and plug-ins is that the UDF feature can be used only for adding new functions to use within the server's SQL dialect. The concept of a plug-in is more broadly applicable and can be used to extend the server in more ways.

Currently, plug-ins are not supported on Microsoft Windows. MySQL is working to lift this limitation but it is at present unclear when this feature will be available for windows.

Currently supported plug-in types


Currently, the MySQL plug-in API supports the following types of plugins:
  • Storage engine: can be used to implement special-purpose row stores for data, which can then be accessed through SQL. Arguably, the Pluggable Storeage Engine Interface is one of the key benefits of MySQL 5.1
  • Full-Text parser: can be used for custom indexing of text-data as well as specialized handling of FULLTEXT query-expressions.
  • Daemon: a daemon plug-in can be used to execute a background process.
  • Information Schema table: are used to implement a 'virtual' table in the MySQL information_schema to report the status of for example the operating system or the server's internals
Currently the plug-in API does not provide support for UDFs but it is expected that in due time, the current user-defined function feature will be merged into the plug-in API.

A closer look at the plug-in API


(Note: in this article I will repeatedly refer to a number of C/C++ header and source files that are part of the MySQL 5.1.22-rc source distribution. Any path that starts with mysql-5.1.22-rc/ should be taken to mean the corresponding path beneath the root of an unpacked MySQL 5.1.22-rc source distribution.)

For all plug-in types, the interface comprises a generic plug-in description structure, which is a struct called st_mysql_plugin. This struct is defined in the plugin.h header file, located in the mysql-5.1.22-rc/include/mysql directory.

The declaration of this structure is as follows:

/*
Plugin description structure.
*/

struct st_mysql_plugin
{
int type; /* the plugin type (a MYSQL_XXX_PLUGIN value) */
void *info; /* pointer to type-specific plugin descriptor */
const char *name; /* plugin name */
const char *author; /* plugin author (for SHOW PLUGINS) */
const char *descr; /* general descriptive text (for SHOW PLUGINS ) */
int license; /* the plugin license (PLUGIN_LICENSE_XXX) */
int (*init)(void *); /* the function to invoke when plugin is loaded */
int (*deinit)(void *);/* the function to invoke when plugin is unloaded */
unsigned int version; /* plugin version (for SHOW PLUGINS) */
struct st_mysql_show_var *status_vars;
struct st_mysql_sys_var **system_vars;
void * __reserved1; /* reserved for dependency checking */
};
Through this structure, the plug-in implementor provides the following things:

  • A number of data members that convey some meta data about the plug-in. These members are used to list the plug-in in the information_schema.PLUGINS system view once it is loaded. Among these is the member char *name. The value that the plug-in implementor provides for the name member is the name that is to be used for the INSTALL PLUGIN and UNINSTALL PLUGIN SQL syntax. For information schema plug-ins, the name member is also used as the table name of the information schema table implemented by the plug-in.

  • Optionally, an array of system variables through which the plug-in may be controlled, and an array of status variables so the plug-in may report its (dynamic) status to the server.

  • pointers to functions that are called upon when the plug-in is loaded (the plugin_init function) and unloaded (the plugin_deinit function)

  • A pointer to an object or structure that provides the actual functionality to implement a plug-in of the specific type

The type-dependent part of a plug-in implementation is usally a struct containing a number of function pointers ('hooks') which are called in a particular sequence that is appropriate for that particular type of plug-in.

Information Schema Plugins


So what is an information_schema plug-in exactly? Well, the information_schema is a virtual database which is primarily intended as a meta data facility: it acts as a container for a collection of read-only 'tables' or rather system views that serve to provide data about the database server itself. As such, it is defined as part of the SQL standard (ISO/IEC 9075-11:2003).

The SQL standard describes a number of views that should appear in the information_schema, and mysql provides partial built-in support for these standard information schema views. The standard also expressly allows database implementors to extend the information_schema by adding new views, or extending the specified tables by adding columns. MySQL information schema plug-ins simply form an interface to allow privileged database users to extend the information schema themselves by writing their own information schema table implementations.

The type-specific API for Information Schema plug-ins


The type-specific part of the plug-in API for information schema plug-ins is formed by the struct called ST_SCHEMA_TABLE, which is defined in mysql-5.1.22-rc/sql/table.h.

When the plug-in is loaded, the server calls the plugin_init function associated with the int (*init)(void *) member of the st_mysql_plugin struct. When called, the server passes a pointer to an instance of a ST_SCHEMA_TABLE struct to the init function. Inside the plugin_init function, it is expected that the plug-in implementor initializes two members of that structure:
  • fields_info - an array of ST_FIELD_INFO structures. Each entry in this array corresponds to a column in the information schema table. The ST_FIELD_INFO structure is also declared in mysql-5.1.22-rc/sql/table.h
  • fill_table - a pointer to a function that is called whenever the server wants to obtain rows from the information_schema table
The ST_SCHEMA_TABLE struct has more members still, but only these two need to be initialized by the plug-in implementor. As long as the plug-in is loaded, the information_schema contains a new table that has a definition corresponding to the column definitions provided by the array assigned to the fields_info member. Whenever that table is queried, the function assigned to the fill_table member is called to actually construct the table's rows.

These two elements really is all there is to the specific API for information schema plug-ins.

Writing an Information Schema Plugin

In this section, I will demonstrate how to write a minimal "Hello World!" MySQL information schema plugin. If you like, you can download the C++ source code.

Apart from a text editor, writing a simple, bare bones information schema plug-in requires no more than a C++ compiler and a number of MySQL's C/C++ header files.

The following examples assume a GNU/Linux environment, a simple text editor, the g++ compiler and the MySQL header files. In order to get the necessary header files it is best to obtain a MySQL source distribution.

(For the purpose of this article, we will assume MySQL 5.1.22-rc. Until MySQL 5.1 is generally available, you are advised to always compile your plug-ins using only the header files shipped with the version of the product whereto the plug-in will be deployed.)

Creating the source file

First, we need to create a C++ source file. We will assume that the working directory is ~/mysql_is_hello/, and that the source file is called mysql_is_hello.cc

Includes

In the top of our source file, we need the include the following header files:

#include <mysql_priv.h>
#include <stdlib.h>
#include <ctype.h>
#include <mysql_version.h>
#include <mysql/plugin.h>
#include <my_global.h>
#include <my_dir.h>

The inclusion of mysql/plugin.h is most important, but currently the other includes are pulled in at some point. So far, this set of includes seems to do the trick.

Defining the columns

It was just mentioned that at some point, the plug-in implementor must provide an array of ST_FIELD_INFO structures that defines the columns of the information schema table. For this example, we'll set up a minimal table definition consisting of one VARCHAR(64) column in order to show a "Hello, world" message. If we flash-forward for a moment - this is the structure of the table we'd like to define:

mysql> desc information_schema.MYSQL_HELLO;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| HELLO | varchar(64) | NO | | | |
+-------+-------------+------+-----+---------+-------+
1 row in set (0.00 sec)
In order to achieve that, we need to declare our ST_FIELD_INFO array like this:

static ST_FIELD_INFO mysql_is_hello_field_info[]=
{
{"HELLO", 64, MYSQL_TYPE_VARCHAR, 0, 0, "Hello", NULL},
{NULL, 0, MYSQL_TYPE_NULL, NULL, NULL, NULL, NULL}
};
The last entry of this array is a dummy that serves as a marker for the end of the array. It is very important to always conclude the array with one such entry. Without such a trailing entry, the server does not know where the array ends. This would of course be a very bad thing, and is likely to result in a crash as soon as the plugin is loaded or the information schema table is accessed.

Now, the first ST_FIELD_INFO entry in the array defines the actual column. When we look at the declaration of ST_FIELD_INFO in mysql-5.1.22-rc/sql/table.h we get a better idea of what an individual column definition is made of:

typedef struct st_field_info
{
const char* field_name;
uint field_length;
enum enum_field_types field_type;
int value;
uint field_flags; // Field atributes(maybe_null, signed, unsigned etc.)
const char* old_name;
uint open_method;
} ST_FIELD_INFO;

  • field_name - this is used as column name

  • field_length - for string-type columns, this is the maximum number of characters. Otherwise, it is the 'display-length' for the column

  • field_type - a value drawn from enum_field_types. This enum is declared in mysql-5.1.22-rc/include/mysql_com.h and denotes a data type for the column. For the most part, there seems to be one entry in the enum for each SQL data type, although there seem to be a number of additional entries in the enum.

  • value - Unfortunately, I haven't got the faintest idea what this is supposed to do. It does not seem to be used by any of the built-in information schema tales

  • field_flags - This is used to set column attributes. By default, columns are NOT NULL and SIGNED, and you can deviate from the default by setting the appopriate flags. You can use either one of the flags MY_I_S_MAYBE_NULL and MY_I_S_UNSIGNED or combine them using the bitwise or operator |. Both flags are defined in mysql-5.1.22-rec/sql/table.h.
  • old_name - I believe this is used by the built-in information schema tables to denote the column name for the corresponding SHOW statement. It's not really applicable to plug-ins I guess, but I always provide a value here, usually a lower case version of the value for the name member
  • open_method - this should be one of SKIP_OPEN_TABLE, OPEN_FRM_ONLY or OPEN_FULL_TABLE. I admit I don't really understand when to choose which option here. It seems likely that it defines in what manner the server must interact with the table, but I can't boast any detailed knowledge in this matter. However, I do know that we can simply use SKIP_OPEN_TABLE for this simple example.

Filling the table

We explained that apart from defining the table columns, the plug-in implementor must also provide a function to actually deliver the rows. Before we can implement the actual fill_table function, we first need a forward declaration of a function that stores one single row in an information schema table:

bool schema_table_store_record(THD *thd, TABLE *table);
This function is defined in mysql-5.1.22-rc/sql/sql_show.cc. It's role will become clear in a moment.

Now we can create the actual fill_table function:
int mysql_is_hello_fill_table(
THD *thd
, TABLE_LIST *tables
, COND *cond
)
{
int status;
CHARSET_INFO *scs= system_charset_info;
TABLE *table= (TABLE *)tables->table;

const char *str = "plugin: hello, information_schema!!!";
table->field[0]->store(
str
, strlen(str)
, scs
);

status = schema_table_store_record(
thd
, table
);

return status;
}
The server passes a number of arguments to the fill_table function:

  • THD *thd - this is a pointer to an instance of the thread descriptor class. In practice, you can think of this as a direct handle to the current session. The THD is declared in mysql-5.1.22-dev/sql/sql_class.h.

  • TABLE_LIST *tables - This is a list of st_table objects. The first entry in that list corresponds to the runtime representation of the table we are implementing as it appears in a query. TABLE_LIST and st_table are defined in mysql-5.1.22-rc/sql/table.h. From what I could infer, it seems that in many cases TABLE (which is a typedef of st_table defined in mysql-5.1.22-dev/sql/handler.h) is used rather than st_table.
  • COND - This is used to pass the instance of the Item class that holds the internal representation of the WHERE-clause. This could then be used by the plugin to return only the rows that are required as per the WHERE condition - something like a pushed down condition. We will ignore this argument for now - the plug-in implementor may use it to optimize the fill_table function, but is not required to do so. So ignoring this argument will not lead to wrong results.
The fill method first obtains a handle to the runtime representation of the information schema table proper:

TABLE *table= (TABLE *)tables->table;
Note that TABLE or st_table structure is the runtime representation of a table, i.e. a table as it appears in a SQL query. This is different from the ST_SCHEMA_TABLE structure which is used to define an information schema table.

We then write the message "plugin: hello, information_schema!!!" to the table's column:

char *str = "plugin: hello, information_schema!!!";
table->field[0]->store(
str
, strlen(str)
, scs
);
So, each TABLE structure (a.k.a st_table) has an array of Field instances which is the runtime representations of table's columns. Here, one of the store methods is called on the Field entry at the 0th index in the field array. Note that the Field at the 0th index corresponds to the 1st column in our SQL table.

Like we just saw for ST_SCHEMA_TABLE versus TABLE, the Field class is a runtime representation of a table column, not to be confused with the ST_FIELD_INFO which merely defines a column.

The Field class is declared in mysql-5.1.22-rc/sql/field.h. It declares a number of overloaded methods to store a value into the current row. In this example we use a variant that is appropriate to store a character string:

/* Store functions returns 1 on overflow and -1 on fatal error */
virtual int store(const char *to, uint length,CHARSET_INFO *cs)=0;
Finally, we get to call the schema_table_store_record function. This finalizes the process of storing a row into the information schema table. This method returns either 0 (in case of success) or 1 (in case of error). Because our example plugin stores just one row, we can simply conclude by returning the result of schema_table_store_record as the result of our fill_table function.

In most practical applications, one would likely have a loop to repeatedly store a row in the table, and one would have to interrupt the normal completion of that loop as soon as the schema_table_store_record returns something not equal to 0.

Putting together the plug-in type-specific implementation


At this point we've taken care of the individial elements to implement an information schema table: we created a ST_FIELD_INFO array to define the columns of the table, and we provided an implementation for the fill_table function to actually deliver the rows. What we still need to do though is hook these elements up with the ST_SCHEMA_TABLE structure that forms the type-specific part of an information schema plug-in.

Like mentioned before, a pointer to a ST_SCHEMA_TABLE instance is passed to the plug-in's init_plugin function. We can now implement the init_plugin function for our specfic plug-in and set it up to use our fields_info and fill_table implementations:

static int mysql_is_hello_plugin_init(void *p)
{
ST_SCHEMA_TABLE *schema= (ST_SCHEMA_TABLE *)p;

schema->fields_info= mysql_is_hello_field_info;
schema->fill_table= mysql_is_hello_fill_table;

return 0;
}
As we can see, there is very little to do here - we simply assign our implementations to the appropriate members of the ST_SCHEMA_TABLE instance, and return 0 to indicate a successful initialization.

We can immediately take care of the plugin_deinit function too:

static int mysql_is_hello_plugin_deinit(void *p)
{
return 0;
}
In this case, we can get away with this simple dummy implementation.

In real-world examples, a plug-in might require some resource like memory or a file. In these cases, the plugin_init function would claim these resources, and the plugin_deinit function would be used to free up those resources again.

Putting together the generic plug-in implementation

With the previous step, we concluded the process of creating the plug-in type specific implementation for an information_schema plug-in. The final touch is that we have to hook this up to the generic part of plug-in interface, that is, we have to provide a st_mysql_plugin structure for our plug-in.

Rather than doing so directly, we use the predefined macros mysql_declare_plugin and mysql_declare_plugin_end for that. (Both these macros are defined in mysql-5.1.22-rc/include/mysql/plugin.h):

struct st_mysql_information_schema mysql_is_hello_plugin=
{ MYSQL_INFORMATION_SCHEMA_INTERFACE_VERSION };

mysql_declare_plugin(mysql_is_hello)
{
MYSQL_INFORMATION_SCHEMA_PLUGIN, /* type constant */
&mysql_is_hello_plugin, /* type descriptor */
"MYSQL_HELLO", /* Name */
"Roland Bouman (http://rpbouman.blogspot.com/)", /* Author */
"Says hello.", /* Description */
PLUGIN_LICENSE_GPL, /* License */
mysql_is_hello_plugin_init, /* Init function */
mysql_is_hello_plugin_deinit, /* Deinit function */
0x0010, /* Version (1.0) */
NULL, /* status variables */
NULL, /* system variables */
NULL /* config options */
}
mysql_declare_plugin_end;
We pass mysql_is_hello to the mysql_declare_plugin macro, and end our plug-in descriptor with the mysql_declare_plugin_end. As far as I can see, these macros are there to take care of some plumbing to allow definition of multiple plug-ins within the same source file.

An important element in putting together the plug-in descriptor is the assignment of the plugin_init and plugin_deinit functions which we discussed in the previous section. Assigning them here to the appropriate members of the plug-in descriptor ensures that the server knows what it must do in order to instantiate the plug-in.

Another important element is assigning the plug-in name, which we chose to be MYSQL_HELLO. We already explained that this name is later used in the INSTALL PLUGIN and DEINSTALL PLUGIN syntax, and that it is also used as the table name for the information_schema table.

Building and installing the plugin

Now that we have created the source file we must compile it and then install the plugin into our server.

Compiling the plugin source file

Assuming the source file mysql_is_hello.cc is located in the current working directory and /home/user/mysql-5.1.22-rc is the path to the MySQL 5.1.22 source distribution, the following line can be used to compile the source file:

g++ -DMYSQL_DYNAMIC_PLUGIN -Wall -shared
-I/home/roland/mysql-5.1.22-rc/include
-I/home/roland/mysql-5.1.22-rc/regex
-I/home/roland/mysql-5.1.22-rc/sql
-o mysql_is_hello.so mysql_is_hello.cc

Note that this is all on one line - I added line breaks to make it easier to read. If all goes well, this should result in a shared object file called mysql_is_hello.so.

Apart from the common g++ options -I (include path), -W (warnings, -Wall means "all warnings"), -shared (to compile as a shared, dynamically linkable library), and -o (output file) we see the specific -DMYSQL_DYNAMIC_PLUGIN. The -D option is there to define a constant (like a #define directive). -DMYSQL_DYNAMIC_PLUGIN triggers some conditional compilation magic that allows the relevant plug-in definitions to be exposed so they are visible from a program that dynamically links the shared object file. So, this option is required to make the plug-in pluggable.

The plugin directory

Once we obtained the mysql_is_hello.so shared object file, we must copy it to the plug-in directory of our installed MySQL 5.1.22 binary. Note that it is not necessary to build MySQL from source - you should be able to install the plug-in in a pre-built MySQL installed from a binary distribution.

The exact location of the plug-in directory is dependent upon specific MySQL distribution and configuration. You can find out its current location by querying the value of the plugin_dir system variable:

mysql> show variables like 'plugin_dir';
+---------------+-----------------------------------------+
| Variable_name | Value |
+---------------+-----------------------------------------+
| plugin_dir | /home/roland/mysql-5.1.22-dev/lib/mysql |
+---------------+-----------------------------------------+
1 row in set (0.01 sec)
So in this case, the shared library mysql_is_hello.so should be copied to /home/roland/mysql-5.1.22-dev/lib/mysql.

Installing the plugin

Once the shared library is in place we can install it using the INSTALL PLUGIN syntax:

mysql> install plugin MYSQL_HELLO soname 'mysql_is_hello.so';
Query OK, 0 rows affected (0.00 sec)
Note that we use the name MYSQL_HELLO for the plugin: this is what we defined earlier as the name member of the st_mysql_plugin plug-in descriptor. Likewise, we use mysql_is_hello.so, which is the file name of our shared object file as soname. The plug-in directory is implied - it should not be possible to install a shared library located at any place outside the plug-in directory.

In order to install a plug-in in this manner, the user has to have privileges to INSERT into the mysql.plugin table, or have the SUPER privilege.

There is a common problem that might occur at this point:

ERROR 1127 (HY000): Can't find symbol '_mysql_plugin_interface_version_' in library
If you see a message like this, it is likely that you forgot to include the -DMYSQL_DYNAMIC_PLUGIN option when compiling the plugin. Adding this option to the g++ compile line is required to create a dynamically loadable plug-in.

Verifying installation

We can now check if the plug-in is correctly installed. We do this by querying the PLUGINS table in the information_schema:

mysql> select * from information_schema.plugins
-> where plugin_name = 'MYSQL_HELLO'\G
*************************** 1. row ***************************
PLUGIN_NAME: MYSQL_HELLO
PLUGIN_VERSION: 0.16
PLUGIN_STATUS: ACTIVE
PLUGIN_TYPE: INFORMATION SCHEMA
PLUGIN_TYPE_VERSION: 50122.0
PLUGIN_LIBRARY: mysql_is_hello.so
PLUGIN_LIBRARY_VERSION: 1.0
PLUGIN_AUTHOR: Roland Bouman (http://rpbouman.blogspot.com/)
PLUGIN_DESCRIPTION: Says hello.
PLUGIN_LICENSE: GPL
1 row in set (0.00 sec)

Using the plug-in

Finally, we get to test our plug-in ;-)

mysql> select * from information_schema.mysql_hello;
+--------------------------------------+
| HELLO |
+--------------------------------------+
| plugin: hello, information_schema!!! |
+--------------------------------------+
1 row in set (0.00 sec)
Of course, this is a gloriously useless application of information schema plug-ins. In a next installment I'll demonstrate that you can do pretty cool stuff with these information schema plug-ins, such as peeking inside the query cache, listing the currently defined savepoints, temporary tables, user variables and much more.

Uninstalling the plugin

When you get tired of the plugin you can uninstall it using the UNINSTALL PLUGIN syntax:

mysql> uninstall plugin MYSQL_HELLO;
Query OK, 0 rows affected (0.00 sec)
Note that currently, due to a bug, you must be sure to use the exact same name for uninstalling the plugin as you did for installing it. I suspect this will be fixed soon, but for now it is best to simply always use the same name, for example the exact name used in the code, MYSQL_HELLO.

Learn More

I will be posting more about information schema plugins shortly. In particular, I will demonstrate how you can report status on server internals such as the query cache to discover which queries are in the cache, the number of blocks they are using and the number of bytes they actually occupying. However, the best way to learn more about extending the MySQL server, the MySQL plug-in API, and the MySQL information_schema is to visit the MySQL user's conference, April 14-17 2008 in Santa Clara CA, USA.

There is a number of great sessions on this and related topics:Of course, once you are at the conference, there will be many MySQL server developers giving you ample opportunity to ask them about some particular details, or maybe have them look at your code. And if you Register by February 26, 2008 you'll save up to $200.

If you can't wait: other great sources of information may be found in the list below:

Cheers, and happy hacking ;-)

7 comments:

mark said...

Phew,

that's an exhaustive article. It'll take me a while to soak up all theinfo - great stuff

Nirbhay Choubey said...

Good write-up, precise & simple, I really enjoyed reading it. Thanks!

Roland Bouman said...

Thanks!

I recently tried this again in MySQL 5.6. Still works!

Salil Jain said...

Quite good .I really enjoyed reading this.But not able to fetch the sample code.Please reply how can I get the sample code

Roland Bouman said...

Salil Jain,

hi! hey, thanks for the comments. I realize the link is broken, but the article is quite old and I do not have this particular source around anymore.

You can try to cut an paste the pieces in a new file and see how far you'll get.

good luck,

Roland

Salil Jain said...

Hi Roland,
Thanks for the quick reply.
Now as you have not discussed anything about main() here so I want to confirm the flow in main(). it will be like
main()
{
init()
fill_table()
deint()
}
Thanks

Roland Bouman said...

Salil,

there is no main function because a plugin is not a standalone program.