Wednesday, August 11, 2010

Back to blogging....

It has been a while since I posted on my blog - in fact, I believe this is the first time ever that more than one month passed between posts since I started blogging. There are a couple of reasons for the lag:

  • Matt Casters, Jos van Dongen and I have spent a lot of time finalizing our forthcoming book, Pentaho Kettle Solutions (Wiley, ISBN: 978-0-470-63517-9). The book is currently in production and should be available, on schedule, in early September 2010. If you're interested, you might like to read one of my earlier posts that explains the organization and outline of the book.

    (I should point out that we have reorganized the outline as the project progressed, so the final result will not have all the chapters mentioned in that post. We do however cover most of the topics mentioned.)

  • I have been checking out Quipu, a promising open source data warehouse management solution. Quipu provides a repository-based, extensible code generator that allows you to generate and maintain a data warehouse based on the Data Vault model. One of the things I want to do in the short term is write templates that allow it to work with MySQL, and after that, I want to see if I can get Quipu to generate Kettle jobs and transformations.

  • I have been working on a couple of software projects. Two of them are currently available as open source on Google Code:

    mql-to-sql: This project allows you to use the Metaweb Query Language (MQL) to query an RDBMS. If you're wondering what this is all about: MQL is the query language used by Freebase (which is a collaborative "database-of-everything" or "wikipedia-gone-database").

    While Freebase is interesting in its own right, I am particularly enthused about the MQL query language. I feel that MQL is an exceptionally good solution for flexible, expressive and secure data access for modern (AJAX) Web applications. Even though MQL was not developed with relational database systems in mind, I think it is a pretty good fit.

    Anyway, this is very much a work in progress, and I appreciate your feedback on it. If you're interested, you can read a bit more about my take on RDBMS data access for web applications and MQL on the mql-to-sql project home page. I have also put up an online demo that allows you to query the sakila MySQL sample database using MQL.

    kettle-cookbook: This is a project that provides auto-documentation for Kettle (a.k.a. Pentaho Data Integration).

    The project consists of a bunch of Kettle jobs and transformations (as well as some XSLT stylesheets) that extract data from Kettle jobs and transformations and transform it into a collection of human-readable HTML documents along with a table of contents. The resulting documentation looks and feels a bit like JavaDoc documentation.

    If all goes well, I will be presenting kettle-cookbook in a Pentaho web seminar, which is currently scheduled for September 15, 2010.

  • I've been enjoying four weeks of vacation (I started working again this week). There's not much to tell about that, other than that it was great spending a lot of time with my family. I plan to do this more often, and now that Kettle Solutions is finished I should be able to find more time to do it.

  • I've been looking at two emerging HTML5 APIs for client-side structured storage, the Web SQL Database API and the Indexed DB API.

    I have developed a few thoughts about the ongoing debate about which one is better, and I see a role for something like MQL here too. I will probably write something about this in the next few weeks.

  • Yesterday, I got wind of the JS1k contest! Basically, the challenge is to write an interesting standalone JavaScript demo program that must be no larger than 1024 bytes. It is amazing and inspiring to see what people manage to do with a modern browser in 1k of JavaScript code.

    I decided to try it myself, and you can find my submission here: An interactive SQL query tool for the Web SQL DB API.

    Essentially, you get a textarea where you can enter arbitrary SQLite queries, a button to execute the SQL, and a result area that prints a feedback message and a result table (if applicable). As a bonus, there's a button to get a listing of the available database objects (using a query on sqlite_master) and an explain button to show the query plan of the current SQL statement.

    The demo works on recent versions of Google Chrome, Apple Safari and Opera. It can run offline, and does not require any plugins. I should say that I expect my submission to be rejected by the judges, since the demo is not functional on Mozilla Firefox, which is a requirement. (That is, the script will detect that the Web SQL Database API is not supported and print a message to that effect.) However, it was still fun to try my hand at it.
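
For readers curious about what such a query tool boils down to, here is a minimal sketch in plain JavaScript. It is not the actual JS1k submission: the helper names esc, renderTable and runSql, the database name "demo" and the 1 MB size estimate are all made up for illustration. It relies only on the Web SQL Database calls openDatabase, transaction and executeSql, and it skips the browser part entirely when openDatabase is not available (as on Firefox).

```javascript
// Sketch only, not the actual JS1k entry. The helper names
// (esc, renderTable, runSql) are hypothetical.

// Escape a value so it can be placed safely inside HTML.
function esc(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build an HTML table from an array of plain row objects.
// The column names are taken from the first row.
function renderTable(rows) {
  if (rows.length === 0) return "";
  var cols = Object.keys(rows[0]);
  var html = "<table><tr>";
  cols.forEach(function (c) { html += "<th>" + esc(c) + "</th>"; });
  html += "</tr>";
  rows.forEach(function (row) {
    html += "<tr>";
    cols.forEach(function (c) { html += "<td>" + esc(row[c]) + "</td>"; });
    html += "</tr>";
  });
  return html + "</table>";
}

// Run one statement through a Web SQL Database handle and pass a
// feedback message plus a result table (possibly empty) to callback.
function runSql(db, sql, callback) {
  db.transaction(function (tx) {
    tx.executeSql(sql, [], function (tx, result) {
      var rows = [];
      for (var i = 0; i < result.rows.length; i++) {
        rows.push(result.rows.item(i));
      }
      callback("OK: " + rows.length + " row(s)", renderTable(rows));
    }, function (tx, error) {
      callback("Error: " + error.message, "");
    });
  });
}

// Browser usage, guarded by feature detection.
// The database name, version, label and size are assumptions.
if (typeof openDatabase === "function") {
  var db = openDatabase("demo", "1.0", "Demo database", 1024 * 1024);
  runSql(db, "SELECT name, type FROM sqlite_master", function (msg, html) {
    document.body.innerHTML = esc(msg) + html;
  });
}
```

Passing the database handle into runSql keeps the query logic separate from the page, so the same function could serve the execute button, the object listing (the query on sqlite_master) and an explain button, for instance by prefixing the statement with EXPLAIN QUERY PLAN.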

Ok - that's it for now. I will try and post more regularly and write about these and other things in the near future. Don't hesitate to leave a comment if you have any questions or suggestions.


Anonymous said...

I'm waiting for the book! I have the other PDI book, which is great too. I hope this adds to that. It beats hunting for answers on the wiki/web.

Shlomi Noach said...

Good to see you back!

LenZ said...

Hi Roland! Welcome back - good to see you around. Thanks for the pointer to the JS1k contest. I'm surprised by the speed differences when looking at the demos with Firefox 3.6 and Chromium...

rpbouman said...

Hi all! thanks for the comments :)

@Lenz: by "speed difference" you mean that Firefox is slow and Chromium is fast? At least, that is the impression I get. For regular browsing, I use Chrome. Firefox (+Firebug) is my development platform.

BTW, nowadays, Opera is more and more often the winner in JavaScript benchmarks. I think it's amazing to see how such a relatively small company, with a marginal market share too, manages to deliver a browser of such high quality. Simply awesome :) Pity it isn't open source.

Unknown said...

Good to have you back. I had a moment to take a glance at Quipu. It looks interesting, and I look forward to reading a blog post about your experiences with the application.

Josep Curto said...

Congratulations on finishing the book!

Gustavo Lopez said...

Good morning Mat,

Let me start off by congratulating you on such a great book, "Pentaho Kettle Solutions", which I purchased on Amazon. It has been of great use to me. I have been using PDI for about 1.5 years now, but recently I ran into a problem that I hope you could help me with. Where should I post this?

rpbouman said...

Hi Gustavo!

(This is Roland. Matt's blog is here.)

Thanks for your support! I'm glad the book is useful to you.

If you just need help with some problem, the best spot is probably, or otherwise the ##pentaho IRC channel on freenode.

Thanks, and kind regards,

Roland Bouman

swathi said...

I want to generate a report using Struts 2.0 and JSP.
The report is generated, but when I want to preview it on the client side, the preview is not shown on the client; it is shown on my own system (i.e., the server).
How do I generate the report on the client side?
My code for generating the report is:
ResourceManager manager = new ResourceManager();
URLConnection uc;
URL url = null;
String s = request.getContextPath();
final String path = System.getProperty("catalina.home") + "/webapps" + s + "/reports/" + "job_card.prpt";
url = new URL("http://localhost:8080/" + s + "/reports/job_card.prpt");
// url = new URL(pathimage);
try {
    uc = url.openConnection();
} catch (IOException e) {
}
org.pentaho.reporting.libraries.resourceloader.Resource res = manager.createDirectly(url, MasterReport.class);

System.out.println("before pdf report creation");
try {
    HtmlReportUtil.createDirectoryHTML((MasterReport) ((org.pentaho.reporting.libraries.resourceloader.Resource) res).getResource(), "http://localhost:8080/LvhtGre1/reports/reportgre.pdf");
} catch (ReportProcessingException e) {
    log.setError(e, "Print Preview in GRE");
} catch (IOException e) {
    log.setError(e, "Print Preview in GRE");
}
MasterReport report = (MasterReport) ((org.pentaho.reporting.libraries.resourceloader.Resource) res).getResource();
ReportParameterValues paramValues = report.getParameterValues();

int code = Integer.parseInt(goodsReceivingEntry.getCode().toString());
paramValues.put("jobcode", code);
paramValues.put("plant_id", plant_id);
// Set Dimension
Toolkit tk = Toolkit.getDefaultToolkit();
Dimension d = tk.getScreenSize();
int width = d.width - 400;
int height = d.height - 100;
int X = (d.width / 2) - (width / 2); // Center horizontally.
int Y = (d.height / 2) - (height / 2); // Center vertically.

try {
    HtmlReportUtil.createStreamHTML(report, "http://localhost:8080/LvhtGre1/reports/reportgre.html");
} catch (ReportProcessingException e) {
    log.setError(e, "Print Preview in GRE");
} catch (IOException e) {
    log.setError(e, "Print Preview in GRE");
}

final PreviewDialog preview = new PreviewDialog(report);
preview.setBounds(X, Y, width, height);
preview.setPreferredSize(new Dimension(width, height));
preview.setMaximumSize(new Dimension(width, height));
preview.setMinimumSize(new Dimension(width, height));
// preview.setVisible(false);

rpbouman said...

Hi swathi, please post questions like this on a general forum.
