Sunday, July 09, 2006

Some thoughts on Pentaho and MySQL

Three weeks have gone by since the presentation of the "Business Intelligence with MySQL and Pentaho" webinar.

People who have been reading this blog probably know how enthusiastic I have been about Pentaho, even before I joined MySQL. Any effort that helps these superb products go together "...like peas and carrots..." is certainly something I like to contribute to as much as I can.

MySQL has supported features that make it extraordinarily useful in reporting and data warehousing environments for quite a long time, and partitioning, one of the major features of the upcoming 5.1 release, will probably not be the last one on the list. Pentaho is of course a much younger project, but that makes what they've achieved so far all the more impressive. Just download the latest release of their preconfigured demo environment and browse through the samples: ETL, Reporting, OLAP, Datamining, Dashboarding, Graphing - all under one hood, integrated in a workflow engine. Naturally, I believe that the Gold Partnership between Pentaho and MySQL is a good move that will lead to many mutual advantages.

So, I think it's not hard to see that this is a combination in which both parties gain tremendous value from each other. Actually, it's probably not at all an exaggeration to state that you do not need any other line of products or tools to implement your long-term enterprise BI solution: MySQL and Pentaho alone will prove to be entirely sufficient for most purposes. (But yeah, your business users will still use Excel spreadsheets. Let them if they want to! Pentaho integrates there too.)

Of course, time will have to tell if I'm right on this one, but I think it's likely that Pentaho has the potential to give the P in LAMP a whole new meaning.

That's not to say things are exactly perfect right now. Pentaho is sensitive to that, and is committed to making it increasingly easy for MySQL users to get the power of Pentaho at their fingertips. And to do that, they need help. Actually, Pentaho, like MySQL and a lot of other open source projects, relies on feedback from those users that know what's best for them. This is what Nicholas Goodman has to say about this:


We want to understand how to make it increasingly easy to use Pentaho with MySQL. In return for providing Pentaho with much needed feedback on ease of use and the user experience for installation/configuration Pentaho is giving away a Mac Mini.


Of course, I don't know if it counts, but here are a few of my thoughts regarding this:


  • Add a (web) interface to Pentaho that makes it easy to install and/or upgrade Connector/J, the MySQL JDBC driver, so that it can be used to build Pentaho solutions. Right now, this is not really difficult (read more), but it is a bit of a hassle. Another wild idea would be to have a Pentaho workflow check the download site to see if there's a newer version of Connector/J, so it can alert the administrator or prompt for an upgrade.

  • Include a setup for the Sakila sample database, and add some samples that demonstrate Pentaho using this database.
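
To make the "newer version of Connector/J" idea from the first suggestion a bit more concrete: the core of such a check is just a version comparison. Here is a minimal sketch in Python, not an actual Pentaho feature. The function names are hypothetical, the jar naming pattern (mysql-connector-java-<version>-bin.jar) is an assumption, and actually fetching the latest version from the download site is deliberately left out.

```python
import re

# Assumed jar naming pattern, e.g. "mysql-connector-java-3.1.12-bin.jar".
_JAR_VERSION = re.compile(r"mysql-connector-java-(\d+(?:\.\d+)*)")


def jar_version(jar_name):
    """Extract the version from a Connector/J jar file name as a tuple of ints."""
    match = _JAR_VERSION.search(jar_name)
    if match is None:
        raise ValueError("no version found in %r" % jar_name)
    return tuple(int(part) for part in match.group(1).split("."))


def needs_upgrade(installed_jar, latest_version):
    """Return True when latest_version (e.g. "5.0.2") is newer than the installed jar."""
    latest = tuple(int(part) for part in latest_version.split("."))
    # Tuples compare element by element, so (3, 1, 12) sorts after (3, 1, 9).
    return latest > jar_version(installed_jar)
```

Comparing tuples of integers rather than raw strings matters here: as strings, "3.1.9" would sort after "3.1.12". A workflow could call needs_upgrade with the name of the installed jar and whatever version it found on the download site, and alert the administrator when the result is true.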



With regard to my second suggestion: a little while ago, I wrote a short article on Kettle, now officially called "Pentaho Data Integration". In that article I only scratched the surface, but I promised then to write another article illustrating a slightly more realistic use case. Well, I still intend to do that, and it should pop up on my blog in the next few days. So, if you're interested, stay tuned.

If you want to know more about MySQL and Pentaho and how these two can strengthen each other, download and view the webinar.
