Showing posts with label xsltproc. Show all posts
Showing posts with label xsltproc. Show all posts

Tuesday, April 20, 2010

Restoring XML-formatted MySQL dumps

EAVB_VFZUHIARHI To whom it may concern -

The mysqldump program can be used to make logical database backups. Although the vast majority of people use it to create SQL dumps, it is possible to dump both schema structure and data in XML format. There are a few bugs (#52792, #52793) in this feature, but these are not the topic of this post.

XML output from mysqldump

Dumping in XML format is done with the --xml or -X option. In addition, you should use the --hex-blob option otherwise the BLOB data will be dumped as raw binary data, which usually results in characters that are not valid, either according to the XML spec or according to the UTF-8 encoding. (Arguably, this is also a bug. I haven't filed it though.)

For example, a line like:

mysqldump -uroot -pmysql -X --hex-blob --databases sakila
dumps the sakila database to the following XML format:

<?xml version="1.0"?>
<mysqldump xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<database name="sakila">
<table_structure name="actor">
<field Field="actor_id" Type="smallint(5) unsigned" Null="NO" Key="PRI" Extra="auto_increment" />
<field Field="first_name" Type="varchar(45)" Null="NO" Key="" Extra="" />
<field Field="last_name" Type="varchar(45)" Null="NO" Key="MUL" Extra="" />
<field Field="last_update" Type="timestamp" Null="NO" Key="" Default="CURRENT_TIMESTAMP" Extra="on update CURRENT_TIMESTAMP" />
<key Table="actor" Non_unique="0" Key_name="PRIMARY" Seq_in_index="1" Column_name="actor_id" Collation="A" Cardinality="200" Null="" Index_type="BTREE" Comment="" />
<key Table="actor" Non_unique="1" Key_name="idx_actor_last_name" Seq_in_index="1" Column_name="last_name" Collation="A" Cardinality="200" Null="" Index_type="BTREE" Comment="" />
<options Name="actor" Engine="InnoDB" Version="10" Row_format="Compact" Rows="200" Avg_row_length="81" Data_length="16384" Max_data_length="0" Index_length="16384" Data_free="233832448" Auto_increment="201" Create_time="2009-10-10 10:04:56" Collation="utf8_general_ci" Create_options="" Comment="" />
</table_structure>
<table_data name="actor">
<row>
<field name="actor_id">1</field>
<field name="first_name">PENELOPE</field>
<field name="last_name">GUINESS</field>
<field name="last_update">2006-02-15 03:34:33</field>
</row>

...many more rows and table structures...

</database>
</mysqldump>
I don't want to spend too much time discussing why it would be useful to make backups in this way. There are definitely a few drawbacks - for example, for sakila, the plain SQL dump, even with --hex-blob is 3.26 MB (3.429.358 bytes), whereas the XML output is 13.7 MB (14,415,665 bytes). Even after zip compression, the XML formatted dump is still one third larger than the plain SQL dump: 936 kB versus 695 kB.

Restoring XML output from mysqldump

A more serious problem is that MySQL doesn't seem to offer any tool to restore XML formatted dumps. The LOAD XML feature, kindly contributed by Erik Wetterberg could be used to some extent for this purpose. However, this feature is not yet available (it will be available in the upcoming version MySQL 5.5), and from what I can tell, it can only load data - not restore tables or databases. I also believe that this feature does not (yet) provide any way to properly restore hex-dumped BLOB data, but I really should test it to know for sure.

Anyway.

In between sessions of the past MySQL users conference I cobbled up an XSLT stylesheet that can convert mysqldump's XML output back to SQL script output. It is available under the LGPL license, and it is hosted on google code as the mysqldump-x-restore project. To get started, you need to download the mysqldump-xml-to-sql.xslt XSLT stylesheet. You also need a command line XSLT processor, like xsltproc. This utility is part of the Gnome libxslt project, and is included in packages for most linux distributions. There is a windows port available for which you can download the binaries.

Assuming that xsltproc is in your path, and the XML dump and the mysqldump-xml-to-sql.xslt are in the current working directory, you can use this command to convert the XML dump to SQL:

xsltproc mysqldump-xml-to-sql.xslt sakila.xml > sakila.sql
On Unix-based systems you should be able to directly pipline the SQL into mysql using

mysql -uroot -pmysql < `xsltproc mysqldump-xml-to-sql.xslt sakila.xml`
The stylesheet comes with a number of options, which can be set through xsltproc's --stringparam option. For example, setting the schema parameter to N will result in an SQL script that only contains DML statements:

xsltproc --stringparam schema N mysqldump-xml-to-sql.xslt sakila.xml > sakila.sql
Setting the data option to N will result in an SQL script that only contains DDL statements:

xsltproc --stringparam data N mysqldump-xml-to-sql.xslt sakila.xml > sakila.sql
. There are additional options to control how often a COMMIT should be issued, whether to add DROP statements, whether to generate single row INSERT statements, and to set the max_allowed_packet size.

What's next?

Nothing much really. I don't really recommend people to use mysqldump's XML output. I wrote mysqldump-x-restore for those people that inherited a bunch of XML formatted dumps, and don't know what to do with them. I haven't thouroughly tested it - please file a bug if you find one. If you actually think it's useful and you want more features, please let me know, and I'll look into it. I don't have much use for this myself, so if you have great ideas to move this forward, I'll let you have commit access.

That is all.

Tuesday, July 14, 2009

open-msp-viewer: Free XSLT utilities to render MS Project files as HTML web pages

IMPORTANT NOTICE

The old xslt-based open-msp-viewer project is no longer updated. Please find a new and much improved pure HTML version here at github: https://github.com/rpbouman/open-msp-viewer. Thank you for your interest.

Original post below

For my day job, I've been working on a few things that allow you to render Microsoft Project 2003 projects on a web page.

The code I wrote for my work is proprietary, and probably not directly useful for most people. But I figured that at least some of the work might be useful for others, so I wrote an open source version from scratch and I published that as the open-msp-viewer project on google code. If you like, check out the code and give it a spin.

It works by first saving the project in the MS Project XML format using standard MS Project functionality (Menu \ Save As..., then pick .XML) and then applying an XSLT transformation to generate HTML.

Currently, the project includes an xslt stylesheet that renders MS Project XML files as a Gantt chart. To give you a quick idea, Take a look at these screenshots:
msp1
and
msp2
The web gantt chart is rendered in a HTML 4.01 variant, CSS 2.1 and uses javascript to allow the user to collapse and/or expand individual tasks in the work breakdown structure. Currently, the HTML does not validate due to a few custom attributes I introduced to support dynamic collapsing/expanding the chart with javascript. In addition, the xslt transform process introduces the msp namespace into the result document, which results in a validation error

You can either associate the xslt stylesheet directly with the MS Project XML file, or you can use an external tool like xsltproc.

In the trunk/xml subdirectory, you can find a couple of sample projects in xml format that already have the stylesheet association. I have tested these in IE8, Chrome 2, Safari 4 and Firefox 3.5, and it works well in all these browsers. In the trunk/html directory, you'll find HTML output as created by xsltproc.

In the future, more xslt stylesheets may be added to support alternative views. Things that I think I will add soon are a resources list and a calendar view.

Enjoy, and let me know if you find a bug or would like to contribute.

DataZen winter meetup 2025

The DataZen winter meetup 2025 is nigh! Join us 18 - 20 February 2025 for 3 days of expert-led sessions on AI, LLM, ChatGPT, Big Data, M...