TABLE_CONSTRAINTS
table in the MySQL information_schema
. The task was:- Specify a minimal set of columns of the information_schema.TABLE_CONSTRAINTS table that is sufficient to reliably identify a single row in the information_schema.TABLE_CONSTRAINTS table.
- Argue why these columns are necessary and sufficient to identify a row, and why a smaller set of columns does not exist
Short Answer
For MySQL there are two such column sets:
CONSTRAINT_SCHEMA
,TABLE_NAME
,CONSTRAINT_NAME
, andCONSTRAINT_TYPE
TABLE_NAME
,TABLE_SCHEMA
,CONSTRAINT_NAME
, andCONSTRAINT_TYPE
Explanation
According to the ISO SQL specification (ISO/IEC 9075-11:2003 (E) 6.48
TABLE_CONSTRAINTS
base table), we can derive that the "conceptual primary key" of a standard implementation of the TABLE_CONSTRAINTS
system view should be (CONSTRAINT_CATALOG
, CONSTRAINT_SCHEMA
, and CONSTRAINT_NAME
). In the absence of support for catalogs, we can ignore the CONSTRAINT_CATALOG
column, and this also applies to MySQL (all %_CATALOG
columns in the MySQL information_schema
are always NULL
). That would leave us with (CONSTRAINT_SCHEMA
and CONSTRAINT_NAME
) - that is, table constraint names would be unique within a schema. However, in MySQL, constraint names need not be unique within a schema. MySQL supports three types of table constraints:
PRIMARY KEY
UNIQUE
FOREIGN KEY
(that is, they are supported dependent upon the storage engine)
Identifying FOREIGN KEY
constraints
FOREIGN KEY
s are in their own per schema namespace, so at a glance it may seem that the combination of (CONSTRAINT_SCHEMA
and CONSTRAINT_NAME
) is sufficient to identify a FOREIGN KEY
constraint. However, because PRIMARY KEY
and UNIQUE
constraints are not within the same namespace as FOREIGN KEY
constraints, a single schema may contain a FOREIGN KEY
constraint and a non-FOREIGN KEY
constraint that have the same CONSTRAINT_NAME
. From this it follows that we need to take the
CONSTRAINT_TYPE
into account too. So in order to reliably identify a
FOREIGN KEY
constraint, we need to have at least CONSTRAINT_SCHEMA
, CONSTRAINT_NAME
*and* require that CONSTRAINT_TYPE
equals 'FOREIGN KEY'
. (Among the answers was a suggestion that the
CONSTRAINT_TYPE
is not necessary, because the CONSTRAINT_NAME
would provide the necessary differentiation. Although this is true for the identifiers generated by MySQL in the absence of explicit identifiers, it is not true in the general case)Identifying PRIMARY KEY
constraints
So, would (
CONSTRAINT_SCHEMA
, CONSTRAINT_NAME
, and CONSTRAINT_TYPE
) be sufficient to identify any of the other types of table constraints? Unfortunately not. Almost everybody that is somewhat familiar with MySQL knows that in MySQL a
PRIMARY KEY
constraint doesn't really have its own identifier. Rather, the CONSTRAINT_NAME
for a PRIMARY KEY
constraint is always 'PRIMARY'
. In fact, the identifier 'PRIMARY'
is reserved exclusively for PRIMARY KEY
constraints, and may not be used for any other type of constraint. From this, it follows that there can be multiple rows in
information_schema.TABLE_CONSTRAINTS
that have the same combination of values in the columns (CONSTRAINT_SCHEMA
, CONSTRAINT_NAME
, and CONSTRAINT_TYPE
). In fact, for one particular value of CONSTRAINT_SCHEMA
there will be just as many occurrences of the combination ('PRIMARY'
, 'PRIMARY KEY'
) in the columns (CONSTRAINT_NAME
, CONSTRAINT_TYPE
) as there are PRIMARY KEY
constraints in that schema.Of course, we know that there can be only one
PRIMARY KEY
per table. From that, it follows that we can identify a PRIMARY KEY
constraint if we know to which table it belongs. Tables (and views) reside in their own per-schema namespace, so the combination of (
TABLE_SCHEMA
and TABLE_NAME
) is sufficient to identify a table. That means that the combination of (TABLE_SCHEMA
, TABLE_NAME
, and CONSTRAINT_TYPE
) is sufficient to identify a PRIMARY KEY
constraint. Once we identified a particular table using (TABLE_SCHEMA
, TABLE_NAME
) we know that the table constraint with the CONSTRAINT_TYPE
equal to 'PRIMARY KEY'
is the PRIMARY KEY
constraint. Of course, because the name
'PRIMARY'
is reserved exclusively for PRIMARY KEY
constraints we could've equally well settled for (TABLE_SCHEMA
, TABLE_NAME
, CONSTRAINT_NAME
) although personally I prefer to use CONSTRAINT_TYPE
.Identifying UNIQUE
constraints
This leaves us with the
UNIQUE
constraints. UNIQUE
constraints must have a unique name per table, and the name must not be equal to 'PRIMARY'
(which is reserved for PRIMARY KEY
constraints). From this we must conclude that we need no less than the combination of (TABLE_SCHEMA
, TABLE_NAME
, CONSTRAINT_TYPE
, and CONSTRAINT_NAME
) to identify a UNIQUE
constraint. We must have (
TABLE_SCHEMA
, TABLE_NAME
) to identify the table because the UNIQUE
constraints have a unique name per table. Now we can take a shortcut: within one table we can distinguish between PRIMARY KEY
and UNIQUE
constraints by virtue of the fact that the CONSTRAINT_NAME
for PRIMARY KEY
s will always be 'PRIMARY'
. However, if we need to distinguish between FOREIGN KEY
and UNIQUE
constraints, we still need to look at the CONSTRAINT_TYPE
too.Summary
To recapitulate:
- we need
CONSTRAINT_SCHEMA
,CONSTRAINT_NAME
,CONSTRAINT_TYPE
to identify aFOREIGN KEY
constraint - we need
TABLE_SCHEMA
,TABLE_NAME
, andCONSTRAINT_TYPE
to identify aPRIMARY KEY
constraint. - Alternatively we could identify
PRIMARY KEY
constraints by looking atTABLE_SCHEMA
,TABLE_NAME
,CONSTRAINT_NAME
by virtue of the fact that allPRIMARY KEY
s always have aCONSTRAINT_NAME
equal to'PRIMARY'
- we need
TABLE_SCHEMA
,TABLE_NAME
,CONSTRAINT_TYPE
andCONSTRAINT_NAME
to identify aUNIQUE
constraint
From this we could conclude that the minimal set of columns required to identify an arbitrary table constraint consists of:
CONSTRAINT_SCHEMA
CONSTRAINT_NAME
CONSTRAINT_TYPE
TABLE_SCHEMA
TABLE_NAME
However, in MySQL, a table constraint is 'owned' by the table on which it is defined and a table constraint defined on particular table cannot reside in a schema different from the schema its table. This means that for any row in
information_schema.TABLE_CONSTRAINTS
, the columns TABLE_SCHEMA
and CONSTRAINT_SCHEMA
will always have the same value. This brings us to the final conclusion that there are two minimal set of columns that may be used to identify an arbitrary constraint in the
information_schema.TABLE_CONSTRAINTS
table:CONSTRAINT_SCHEMA
,TABLE_NAME
,CONSTRAINT_NAME
,CONSTRAINT_TYPE
TABLE_SCHEMA
,TABLE_NAME
,CONSTRAINT_NAME
,CONSTRAINT_TYPE
As
CONSTRAINT_SCHEMA
and TABLE_SCHEMA
are completely interchangeable as far as the result is concerned, there is no specific reason to generically prefer either combination.To learn more about this and other traps in the MySQL
information schema
, come meet me at the MySQL User's conference. I will be your guide on the Grand Tour of the Information Schema and its Applications, and explain more of these gotcha's.
4 comments:
I am trying to find the PRIMARY_KEY for a table to optimize replication from one host to another.
To find the PRIMARY_KEY, I am using the query:
'SELECT k.column_name
FROM information_schema.table_constraints t
JOIN information_schema.key_column_usage k
USING(constraint_name,table_schema,table_name)
WHERE t.constraint_type='PRIMARY KEY'
AND t.table_schema='db_name'
AND t.table_name='tbl_name';'
However, two tables within information_schema that are taking a long time to process 'SELECT *', 4+ minutes.
All the other tables in this schema respond promptly with 'SELECT *'.
'SELECT * FROM information_schema.TABLE_CONSTRAINTS' takes 3+ minutes
'SELECT COUNT(*) FROM information_schema.TABLE_CONSTRAINTS' = 232
'SELECT * FROM information_schema.KEY_COLUMN_USAGE' takes 3+ minutes
'SELECT COUNT(*) FROM information_schema.KEY_COLUMN_USAGE' = 291
Ideas?
Thanks
Jeff in Seattle
I am trying to find the PRIMARY_KEY for a table to optimize replication from one host to another.
To find the PRIMARY_KEY, I am using the query:
'SELECT k.column_name
FROM information_schema.table_constraints t
JOIN information_schema.key_column_usage k
USING(constraint_name,table_schema,table_name)
WHERE t.constraint_type='PRIMARY KEY'
AND t.table_schema='db_name'
AND t.table_name='tbl_name';'
However, two tables within information_schema that are taking a long time to process 'SELECT *', 4+ minutes.
All the other tables in this schema respond promptly with 'SELECT *'.
'SELECT * FROM information_schema.TABLE_CONSTRAINTS' takes 3+ minutes
'SELECT COUNT(*) FROM information_schema.TABLE_CONSTRAINTS' = 232
'SELECT * FROM information_schema.KEY_COLUMN_USAGE' takes 3+ minutes
'SELECT COUNT(*) FROM information_schema.KEY_COLUMN_USAGE' = 291
Ideas?
Thanks
Jeff in Seattle
Hi Jeff,
your query looks good, you can try to see if things speed up by putting everything in the JOIN condition:
INNER JOIN
ON tcs.table_schema = '[the schema]'
AND tcs.table_name = '[the table]'
AND tcs.table_schema = kcu.table_schema
AND tcs.table_name = kcu.table_name
AND tcs.constraint_type = 'PRIMARY KEY'
That said, information_schema performance is sadly quite poor in some cases: see http://bugs.mysql.com/bug.php?id=19588
There is some improvement in 5.1
Does your replication optimization require instant information on the primary key? In that case it's probably better build some kind of cache for the metadata you need, or obtain it in a different way.
(I know, that sucks. But sadly that's all I can think of. Please add your comment to the bug report - MySQL needs to know people are suffering from the current performance problems)
Figured it out:
information_schema.TABLE_CONSTRAINTS AS `tcs`
information_schema.KEY_COLUMN_USAGE AS `kcu`
Post a Comment