Saturday, February 21, 2009

Rare find...

First time in my lifetime....

wikiout

And indeed, about a minute later all is back to normal.

My respect goes out to the architects and admins of wikipedia that somehow manage to keep this large scale operation running so smoothly. It's truly amazing.

Friday, February 20, 2009

Exporting a Kettle Repository to Files

Hi All!

Today I'd like to announce KREX, a small solution I put together to export a Kettle (a.k.a. Pentaho Data Integration) Repository to individual transformation (.ktr) and and job (.kjb) files.

The idea to create this was inspired by this thread on the pentaho forums, started by kandrews. He (she?) wrote:
Has anyone ever been able to export a PDI repository and convert it somehow into regular non-repository .kjb & .ktr files? If you have done this already or this functionality already exists please let me know.

My initial thoughts are possibly an XLS translation against the XML from the repository export. Thoughts?
Well, I hope this helps! Enjoy en let me know if its useful. Be advised that in the same thread, Matt Casters already revealed that the functionality to do this will soon be built into PDI, but until then this may be of use.

To start using KREX,

  • checkout the repository or download the Job and Transformation files to your file system.

  • Open the main Job file export_repository_to_files.kjb using Pentaho Data Integration 3.2's spoon (Currently a Milestone 1 release)

  • Configure the Set Source Repository Step in the set_source_repo_and_target_directory transformation to match the repository you want to export

  • Run the main job file (export_repository_to_files.kjb)

If all goes well, you should now have a directory called pdi_repo_export in your home directory which contains a subdirectory named after your exported repository containing the directory tree with the .ktr and .kjb files.

Here's a quick screenshot of the main job, just to give you an idea:
krexThe heart of the job is formed by the very last transformation, which does the actual legwork of extracting and saving the individual transformations:
krex2
The steps before that are mainly configuration and ensuring that the directory tree that is to contain the files is created before we attempt to write any files.

If you have any suggestions or comments, I welcome you to post them here. If you are trying to use KREX but run into an issue, please use the KREX issuelist.

If you are looking for more tips and trick with kettle and Pentaho in general, stay tuned. The "Building Pentaho Solutions" book I'm writing for Wiley together with Jos van Dongen will contain tons and tons of practical tips and solutions, and explain many of its technologies and concepts in thorough detail.

Cheers and until next time,

Roland

DuckDB Bag of Tricks: Reading JSON, Data Type Detection, and Query Performance

DuckDB bag of tricks is the banner I use on this blog to post my tips and tricks about DuckDB . This post is about a particular challenge...