Views are a great feature which is available as of MySQL 5. A view can be seen as a named and persistently stored
SELECT
expression. After defining the view it can be referenced just like a table in other queries.
Metadata for MySQL Views can be queried using the
TABLES
and
VIEWS
tables from the
information_schema
. This can be very helpful when maintaining or developing views.
However, this metadata would be even more valueable if it would provide a few more items:
- Status
- A view becomes broken when the database objects on which it depends (tables, functions, other views) are modified or dropped. It would be nice to be able to run a query to identify exactly those views.
- The original sourcecode
- Right now, the view source is parsed and becomes rewritten, stripping out any comments and indentation. The information_schema only provides access to this rewritten statement in the
VIEW_DEFINITION
column of the information_schema.VIEWS
table; an unpleasant suprise if you spent a lot of time commenting your view code.
- Algorithm
- In the
CREATE VIEW
statement, one can specify the Algorithm that should be used to implement the view. Because the algorithm can have a big impact on the performance it would be nice to be able to query that.
The View Status
Although it's not explicitly supported, it's actually possible to create a conventional query that returns the status for a given view (or set of views). The solution relies on the value of the
TABLE_COMMENT
column of the
information_schema.TABLES
table.
For base tables, this column contains the text specified for the
COMMENT
table option specified in the
CREATE TABLE
or
ALTER TABLE
statement. For views, this normally contains the text
'VIEW'
:
mysql> USE test;
Database changed
mysql> CREATE TABLE t(
-> id int
-> )
-> ENGINE = MyISAM
-> COMMENT = 'Comments on this table'
-> ;
Query OK, 0 rows affected (0.10 sec)
mysql> CREATE VIEW v
-> AS
-> SELECT *
-> FROM t;
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT table_name
-> , table_type
-> , table_comment
-> FROM information_schema.tables
-> WHERE table_schema = schema()
-> AND table_name IN ('t','v')
-> ;
+------------+------------+------------------------+
| table_name | table_type | table_comment |
+------------+------------+------------------------+
| t | BASE TABLE | Comments on this table |
| v | VIEW | VIEW |
+------------+------------+------------------------+
2 rows in set (0.00 sec)
Now, we will deliberately break the view by dropping the table upon which it depends:
mysql> DROP TABLE t;
Query OK, 0 rows affected (0.04 sec)
When we run our select again, this is what we get:
+------------+------------+----------------------------------------------------------------------------------+
| table_name | table_type | table_comment |
+------------+------------+----------------------------------------------------------------------------------+
| v | VIEW | View 'test.v' references invalid table(s) or column(s) or function(s) or defin |
+------------+------------+----------------------------------------------------------------------------------+
So, we're witnessing the fact that MySQL is using this column in it's very own way. Because you cannot specify a
COMMENT
table option when creating or altering a view, MySQL can pretty much decide for itself how to use this column, and it turns out we can take advantage of this.
(Another case where MySQL uses it for it's own means is seen in InnoDB tables. In those cases, some of the tail of the comment is used to log some innodb specific info, possibly truncating the actually entered comment text.)
Now that we know this, it won't be too hard to write a query that identifies broken views:
SELECT *
FROM information_schema.tables
WHERE table_type = 'VIEW'
AND table_type != table_comment
(Later on, we'll modify the query in order to have a
VIEW_STATUS
column report a value of either
'VALID'
or
'INVALID'
, indicating wheter the view is broken or not. Of course, that modification is trivial.)
Under the Hood: Algorithm and Sourcecode
MySQL stores the view definition in a
.frm
file in the database directory beneath the
data
directory, just like it does for tables. Let's take a look at the
v.frm
file: :
TYPE=VIEW
query=select `test`.`t`.`id` AS `id` from `test`.`t`
md5=25ace41c7f6b0927b2c4ed5227e64ad0
updatable=1
algorithm=0
definer_user=root
definer_host=localhost
suid=1
with_check_option=0
revision=1
timestamp=2006-05-09 22:34:46
create-version=1
source=select *\nfrom t
I've highlighted some items of interest:
- the
query
is the code after MySQL's done rewriting the query.
- Here, the
algorithm
item is associated with the number 0
. By creating views with explicit ALGORITHM
specifications, we can quickly learn that this can have either on of the values 0
,1
or 2
, corresponding to the algorithms UNDEFINED
, TEMPTABLE
and MERGE
.
- the
source
item is almost exactly the code as it was entered by the user. The whitespace is still there, only the newlines are escaped with an escape sequence (\n
)
Now that we know that it's stored over there, we can try to use it in a query. To do that, we need to be able to somehow access and parse (or actually scan rather than parse) the file contents...And we can, with the
LOAD_FILE()
function!
Using LOAD_FILE()
The
LOAD_FILE()
function accepts a filename as a string argument. It tries to read the specified file, and returns the contents as a string. You need the
FILE
privilege to do this, and the file must not be larger than
max_allowed_packet
bytes.
On my Windows XP system, the data directory is located at
D:\MySQL\MySQL Server 5.1\data
. The view
v
resides in the
test
database, so I can access the contents of the corresponding
.frm
file using this query:
SELECT load_file(
'D:\\MySQL\\MySQL Server 5.1\\data\\test\\v.frm'
)
;
In a string literal, a backslash introduces an escape sequence for special characters. So, in order to get a literal backslash, the backslash itself needs to be escaped by typing two backslashes. Tip for windows users: LOAD_FILE will also accept forward slashes as directory separator, so this works equally well (better, actually):
SELECT load_file(
'D:/MySQL/MySQL Server 5.1/data/test/v.frm'
);
Now, we want to be able to drive this using data from the
information_schema.VIEWS
table, so the following modification is quite straightforward:
SELECT load_file(
concat(
'D:/MySQL/MySQL Server 5.1/data/'
, table_schema
, '/'
, table_name
, '.frm'
)
) file_contents
FROM information_schema.VIEWS
;
Of course, we would also like to be able to parameterize the value for the data directory. At first, it looks as if you can:
mysql> show variables like '%dir%';
+----------------------------+-------------------------------------------+
| Variable_name | Value |
+----------------------------+-------------------------------------------+
| basedir | D:\MySQL\MySQL Server 5.1\ |
| bdb_logdir | |
| bdb_tmpdir | |
| character_sets_dir | D:\MySQL\MySQL Server 5.1\share\charsets\ |
| datadir | D:\MySQL\MySQL Server 5.1\Data\ |
| innodb_data_home_dir | |
| innodb_log_arch_dir | |
| innodb_log_group_home_dir | .\ |
| innodb_max_dirty_pages_pct | 90 |
| plugin_dir | D:\MySQL\MySQL Server 5.1\lib/ |
| slave_load_tmpdir | D:\MySQL\Datatemp |
| tmpdir | D:/MySQL/Datatemp |
+----------------------------+-------------------------------------------+
12 rows in set (0.00 sec)
But alas, it doesn't work that way:
mysql> select @@datadir;
ERROR 1193 (HY000): Unknown system variable 'datadir'
I'm not really sure if this behaviour constitutes a bug, but it doesn't make much sense to me. A quick search in
bugs.mysql.com did find me a report about this (bug #1039). As I cannot change this behaviour, I'm settling for the literal right now. If anyone's got a tip to improve on this, I'll be most glad to hear about it.
(Actually, the minor improvement I came up with was using a user-defined variable but this got me into trouble again because of an entirely
different bug alltogether)
'Parsing' the file contents
Now that we can load the view's
.frm
file, we can try to pick out the items of interest.
(I wouldn't call it parsing exactly, it's more of a scanning/recognition process)
For example, suppose we want to extract the
algorithm
item from the file contents. First, we'll create a few different views to be able to test the effect:
create algorithm=undefined view v_undefined as select * from t;
create algorithm=merge view v_merge as select * from t;
create algorithm=temptable view v_temptable as select * from t;
Now we can write the actual query. First, we want to look for the string
algorithm=
; then, our value is sitting there right after that string as the remainder of the line. In MySQL, we can solve this 'excision'
SUBSTRING()
and
INSTR()
and the like, but I find that a combination
SUBSTRING_INDEX()
is really much, much more elegant, as well as less error-prone. That's because you don't have to repeat or duplicate
any of the strings that determine the excision: I need only one occurrence of the input string, one occurrence of the starting delimiter and one occurrence of the ending delimiter. (A variant of this theme let's you select the n-th item in a separated list by modifying one integer only). Anyway, this is what I'm doing now:
select v.table_schema as schema_name
, v.table_name as view_name
, substring_index(
substring_index(
load_file(
concat(
'D:/MySQL/MySQL Server 5.1/data/'
, v.table_schema
, '/'
, v.table_name
, '.frm'
)
)
, 'algorithm='
, -1
)
, '\n'
, 1
) as algorithm
from information_schema.views v
where v.table_schema = schema()
and v.table_name like 'v\_%'
and this gives us the following result:
+-------------+---------------+-----------+
| schema_name | view_name | algorithm |
+-------------+---------------+-----------+
| test | v_merge | 2 |
| test | v_temptable | 1 |
| test | v_undefined | 0 |
+-------------+---------------+-----------+
Ok, we can make the output pretty later on, so people can actually read what algorithm was used. We'll take care of that after we're done extracting the original source code.
By itself, excising the souce code is even easier than what we already saw for the
algorithm
item, because the source code happens to be the last item in the
.frm
file. There's just one complicating factor, and that's that some special characters in the orignal statement appear as plain, escaped text there. So, in order to get to the actual source code, you need to apply the
REPLACE()
function. This worked for me:
replace(
replace(
replace(
replace(
replace(
replace(
replace(
replace(
replace(
replace(
replace(
substring_index(
load_file(
concat(
'D:/MySQL/MySQL Server 5.1/data/'
, v.table_schema
, '/'
, v.table_name
, '.frm'
)
)
, '\nsource='
, -1
)
, '\\_','\_')
, '\\%','\%')
, '\\\\','\\')
, '\\Z','\Z')
, '\\t','\t')
, '\\r','\r')
, '\\n','\n')
, '\\b','\b')
, '\\\"','\"')
, '\\\'','\'')
, '\\0','\0')
Putting it together
Now, let's finish what we've started. Suppose we want to create a query that shows
both the algorithm, as well as the original source code. Well, obviously, we can take the expressions for the algorithm and source, just as described before, and
SELECT
both of them. This means we will be loading the file 2 times (That is, unless the optimizer is clever enough to exclude it).
I don't know if it makes a difference performance-wise, but I use a little trick to prevent duplicate loads of the file. I assign the result of
LOAD_FILE()
to a (session) user variable, and reference that wherever I need the file contents. Here's the final query:
select v.table_schema as schema_name
, v.table_name as view_name
, case
substring_index(
substring_index(
@frm:=load_file(
concat(
'D:/MySQL/MySQL Server 5.1/data/'
, v.table_schema
, '/'
, v.table_name
, '.frm'
)
)
, 'algorithm='
, -1
)
, '\n'
, 1
)
when 0 then 'UNDEFINED'
when 1 then 'TEMPTABLE'
when 2 then 'MERGE'
else 'UNRECOGNIZED'
end as algorithm
, replace(
replace(
replace(
replace(
replace(
replace(
replace(
replace(
replace(
replace(
replace(
substring_index(
@frm
, '\nsource='
, -1
)
, '\\_','\_')
, '\\%','\%')
, '\\\\','\\')
, '\\Z','\Z')
, '\\t','\t')
, '\\r','\r')
, '\\n','\n')
, '\\b','\b')
, '\\\"','\"')
, '\\\'','\'')
, '\\0','\0') as view_source
, if(t.table_type=table_comment,'YES'
, 'NO'
) as view_status
from information_schema.views v
inner join information_schema.tables t
on v.table_schema = t.table_schema
and v.table_name = t.table_name
This includes the extra modifications to create the
view_status
column and the
algorithm
column.