TABLE_CONSTRAINTS table in the MySQL information_schema. The task was:- Specify a minimal set of columns of the information_schema.TABLE_CONSTRAINTS table that is sufficient to reliably identify a single row in the information_schema.TABLE_CONSTRAINTS table.
- Argue why these columns are necessary and sufficient to identify a row, and why a smaller set of columns does not exist
Short Answer
For MySQL there are two such column sets:
- CONSTRAINT_SCHEMA,- TABLE_NAME,- CONSTRAINT_NAME, and- CONSTRAINT_TYPE
- TABLE_NAME,- TABLE_SCHEMA,- CONSTRAINT_NAME, and- CONSTRAINT_TYPE
Explanation
According to the ISO SQL specification (ISO/IEC 9075-11:2003 (E) 6.48
TABLE_CONSTRAINTS base table), we can derive that the "conceptual primary key" of a standard implementation of the TABLE_CONSTRAINTS system view should be (CONSTRAINT_CATALOG, CONSTRAINT_SCHEMA, and CONSTRAINT_NAME). In the absence of support for catalogs, we can ignore the CONSTRAINT_CATALOG column, and this also applies to MySQL (all %_CATALOG columns in the MySQL information_schema are always NULL). That would leave us with (CONSTRAINT_SCHEMA and CONSTRAINT_NAME) - that is, table constraint names would be unique within a schema. However, in MySQL, constraint names need not be unique within a schema. MySQL supports three types of table constraints:
- PRIMARY KEY
- UNIQUE
- FOREIGN KEY(that is, they are supported dependent upon the storage engine)
Identifying FOREIGN KEY constraints
FOREIGN KEYs are in their own per schema namespace, so at a glance it may seem that the combination of (CONSTRAINT_SCHEMA and CONSTRAINT_NAME) is sufficient to identify a FOREIGN KEY constraint. However, because PRIMARY KEY and UNIQUE constraints are not within the same namespace as FOREIGN KEY constraints, a single schema may contain a FOREIGN KEY constraint and a non-FOREIGN KEY constraint that have the same CONSTRAINT_NAME. From this it follows that we need to take the
CONSTRAINT_TYPE into account too. So in order to reliably identify a
FOREIGN KEY constraint, we need to have at least CONSTRAINT_SCHEMA, CONSTRAINT_NAME *and* require that CONSTRAINT_TYPE equals 'FOREIGN KEY'. (Among the answers was a suggestion that the
CONSTRAINT_TYPE is not necessary, because the CONSTRAINT_NAME would provide the necessary differentiation. Although this is true for the identifiers generated by MySQL in the absence of explicit identifiers, it is not true in the general case)Identifying PRIMARY KEY constraints
So, would (
CONSTRAINT_SCHEMA, CONSTRAINT_NAME, and CONSTRAINT_TYPE) be sufficient to identify any of the other types of table constraints? Unfortunately not. Almost everybody that is somewhat familiar with MySQL knows that in MySQL a
PRIMARY KEY constraint doesn't really have its own identifier. Rather, the CONSTRAINT_NAME for a PRIMARY KEY constraint is always 'PRIMARY'. In fact, the identifier 'PRIMARY' is reserved exclusively for PRIMARY KEY constraints, and may not be used for any other type of constraint. From this, it follows that there can be multiple rows in
information_schema.TABLE_CONSTRAINTS that have the same combination of values in the columns (CONSTRAINT_SCHEMA, CONSTRAINT_NAME, and CONSTRAINT_TYPE). In fact, for one particular value of CONSTRAINT_SCHEMA there will be just as many occurrences of the combination ('PRIMARY', 'PRIMARY KEY') in the columns (CONSTRAINT_NAME, CONSTRAINT_TYPE) as there are PRIMARY KEY constraints in that schema.Of course, we know that there can be only one
PRIMARY KEY per table. From that, it follows that we can identify a PRIMARY KEY constraint if we know to which table it belongs. Tables (and views) reside in their own per-schema namespace, so the combination of (
TABLE_SCHEMA and TABLE_NAME) is sufficient to identify a table. That means that the combination of (TABLE_SCHEMA, TABLE_NAME, and CONSTRAINT_TYPE) is sufficient to identify a PRIMARY KEY constraint. Once we identified a particular table using (TABLE_SCHEMA, TABLE_NAME) we know that the table constraint with the CONSTRAINT_TYPE equal to 'PRIMARY KEY' is the PRIMARY KEY constraint. Of course, because the name
'PRIMARY' is reserved exclusively for PRIMARY KEY constraints we could've equally well settled for (TABLE_SCHEMA, TABLE_NAME, CONSTRAINT_NAME) although personally I prefer to use CONSTRAINT_TYPE.Identifying UNIQUE constraints
This leaves us with the
UNIQUE constraints. UNIQUE constraints must have a unique name per table, and the name must not be equal to 'PRIMARY' (which is reserved for PRIMARY KEY constraints). From this we must conclude that we need no less than the combination of (TABLE_SCHEMA, TABLE_NAME, CONSTRAINT_TYPE, and CONSTRAINT_NAME) to identify a UNIQUE constraint. We must have (
TABLE_SCHEMA, TABLE_NAME) to identify the table because the UNIQUE constraints have a unique name per table. Now we can take a shortcut: within one table we can distinguish between PRIMARY KEY and UNIQUE constraints by virtue of the fact that the CONSTRAINT_NAME for PRIMARY KEYs will always be 'PRIMARY'. However, if we need to distinguish between FOREIGN KEY and UNIQUE constraints, we still need to look at the CONSTRAINT_TYPE too.Summary
To recapitulate:
- we need CONSTRAINT_SCHEMA,CONSTRAINT_NAME,CONSTRAINT_TYPEto identify aFOREIGN KEYconstraint
- we need TABLE_SCHEMA,TABLE_NAME, andCONSTRAINT_TYPEto identify aPRIMARY KEYconstraint.
- Alternatively we could identify PRIMARY KEYconstraints by looking atTABLE_SCHEMA,TABLE_NAME,CONSTRAINT_NAMEby virtue of the fact that allPRIMARY KEYs always have aCONSTRAINT_NAMEequal to'PRIMARY'
- we need TABLE_SCHEMA,TABLE_NAME,CONSTRAINT_TYPEandCONSTRAINT_NAMEto identify aUNIQUEconstraint
From this we could conclude that the minimal set of columns required to identify an arbitrary table constraint consists of:
- CONSTRAINT_SCHEMA
- CONSTRAINT_NAME
- CONSTRAINT_TYPE
- TABLE_SCHEMA
- TABLE_NAME
However, in MySQL, a table constraint is 'owned' by the table on which it is defined and a table constraint defined on particular table cannot reside in a schema different from the schema its table. This means that for any row in
information_schema.TABLE_CONSTRAINTS, the columns TABLE_SCHEMA and CONSTRAINT_SCHEMA will always have the same value. This brings us to the final conclusion that there are two minimal set of columns that may be used to identify an arbitrary constraint in the
information_schema.TABLE_CONSTRAINTS table:- CONSTRAINT_SCHEMA,- TABLE_NAME,- CONSTRAINT_NAME,- CONSTRAINT_TYPE
- TABLE_SCHEMA,- TABLE_NAME,- CONSTRAINT_NAME,- CONSTRAINT_TYPE
As
CONSTRAINT_SCHEMA and TABLE_SCHEMA are completely interchangeable as far as the result is concerned, there is no specific reason to generically prefer either combination.To learn more about this and other traps in the MySQL
information schema, come meet me at the MySQL User's conference. I will be your guide on the Grand Tour of the Information Schema and its Applications, and explain more of these gotcha's.
 
 
 
4 comments:
I am trying to find the PRIMARY_KEY for a table to optimize replication from one host to another.
To find the PRIMARY_KEY, I am using the query:
'SELECT k.column_name
FROM information_schema.table_constraints t
JOIN information_schema.key_column_usage k
USING(constraint_name,table_schema,table_name)
WHERE t.constraint_type='PRIMARY KEY'
AND t.table_schema='db_name'
AND t.table_name='tbl_name';'
However, two tables within information_schema that are taking a long time to process 'SELECT *', 4+ minutes.
All the other tables in this schema respond promptly with 'SELECT *'.
'SELECT * FROM information_schema.TABLE_CONSTRAINTS' takes 3+ minutes
'SELECT COUNT(*) FROM information_schema.TABLE_CONSTRAINTS' = 232
'SELECT * FROM information_schema.KEY_COLUMN_USAGE' takes 3+ minutes
'SELECT COUNT(*) FROM information_schema.KEY_COLUMN_USAGE' = 291
Ideas?
Thanks
Jeff in Seattle
I am trying to find the PRIMARY_KEY for a table to optimize replication from one host to another.
To find the PRIMARY_KEY, I am using the query:
'SELECT k.column_name
FROM information_schema.table_constraints t
JOIN information_schema.key_column_usage k
USING(constraint_name,table_schema,table_name)
WHERE t.constraint_type='PRIMARY KEY'
AND t.table_schema='db_name'
AND t.table_name='tbl_name';'
However, two tables within information_schema that are taking a long time to process 'SELECT *', 4+ minutes.
All the other tables in this schema respond promptly with 'SELECT *'.
'SELECT * FROM information_schema.TABLE_CONSTRAINTS' takes 3+ minutes
'SELECT COUNT(*) FROM information_schema.TABLE_CONSTRAINTS' = 232
'SELECT * FROM information_schema.KEY_COLUMN_USAGE' takes 3+ minutes
'SELECT COUNT(*) FROM information_schema.KEY_COLUMN_USAGE' = 291
Ideas?
Thanks
Jeff in Seattle
Hi Jeff,
your query looks good, you can try to see if things speed up by putting everything in the JOIN condition:
INNER JOIN
ON tcs.table_schema = '[the schema]'
AND tcs.table_name = '[the table]'
AND tcs.table_schema = kcu.table_schema
AND tcs.table_name = kcu.table_name
AND tcs.constraint_type = 'PRIMARY KEY'
That said, information_schema performance is sadly quite poor in some cases: see http://bugs.mysql.com/bug.php?id=19588
There is some improvement in 5.1
Does your replication optimization require instant information on the primary key? In that case it's probably better build some kind of cache for the metadata you need, or obtain it in a different way.
(I know, that sucks. But sadly that's all I can think of. Please add your comment to the bug report - MySQL needs to know people are suffering from the current performance problems)
Figured it out:
information_schema.TABLE_CONSTRAINTS AS `tcs`
information_schema.KEY_COLUMN_USAGE AS `kcu`
Post a Comment