Friday, January 02, 2009

Writing a Book: Building Pentaho Solutions

Ok - this has been stewing for some time now, and I think now is the right time to announce that I am working together with Jos van Dongen from Tholis Consulting to create a book for Wiley with the tentative title "Building Pentaho Solutions".

My personal aim is to make this book the primary point of reference for DBAs and Application Developers that are familiar with Open Source products like MySQL and PostgreSQL but have no prior BI skills, as well as BI professionals that are familiar with closed source BI products like Microsoft BI and Business Objects that want to learn how to get things done with Pentaho.

The book will cover all distinct components and sub-products that make up the Pentaho BI Suite. For each component (and where applicable), installation, usage and maintenance are discussed and illustrated.

Background theory is given as needed to provide context for those readers with no prior BI knowledge or experience. This means we will cover topics such as dimensional modeling, data warehousing, data integration and much more. At the same time, the book will have a strong "hands-on" focus, and for that purpose, Jos and I have put together a fairly realistic online DVD Rental Company.

We went as far as creating different database schemas for the operational applications (for customer orders, inventory management, and purchase orders), and generating non-trivial volumes of example data for it (tables with 100k to 10m rows), covering over 7 years of DVD rental business - and this is just the operational system. And although generated, it's not just random data - things like customer age and location distribution, per year and per week ordering behaviour etc. - it's all in our sample data.

In our schemas and sample solutions, we are making an effort to cater to MS SQL, MySQL, Oracle and PostgreSQL users. We will also explore a few highly interesting open source database products like LucidDB and MonetDB.

Anyway - I will soon post some more details about the book and its contents as we progress. I have taken Baron Schwartz's advice on writing a book to heart, and we are working from a pretty detailed outline trying to meet a darn-tight schedule.

If all goes well, we should end up with a 450-500 page book in a store near you by August 2009.

Jos and I are in the final stages of writing the book. We will then have to process reviewer comments and set up the book's website (including all code samples).

We've had a few changes to our schedule. Instead of a 400-500 page book, it will be a 500-600 page book. It was due to be available late August, but this will now be early September.

The book can be pre-ordered on Amazon. More information on the book can be found on the Wiley website.

UPDATE2: The book is in print and available since late August 2009! So far, we've had a couple of very nice reviews.

The book is available directly from the Wiley site and also from Amazon. Wiley also provides an e-book. If the link to the e-book is not working, or the site claims the book is not available as an e-book, then you most likely ended up on the European version of the website. To fix that and order the e-book anyway, click the "choose location" link on the Wiley homepage, set it to United States, and try again.


Anonymous said...

Hi Roland,

I'm sure the book will be a very good read. Good luck!


Feris Thia said...

Hi Roland,

As a Pentaho system integrator partner, I am eagerly waiting for the book, which may serve as a better guide for my existing customers and for people who want to know more about Pentaho.



Anonymous said...

Sounds like your book will fill a void. A small but important part of a service that we were building required some ETL and I looked for a book but didn't find anything. The (PDI) documentation sort of assumes that you work as a BI professional. Don't get me wrong - as far as open source documentation goes there is far worse than Pentaho. The biggest problem I had is that even if the individual steps (transformations) are well documented, it's not readily apparent which ones you need together to achieve the goal.


Anonymous said...

Roland - excellent news! We've had many requests for a book around Pentaho that also covers the basics of DW/BI. I'm sure this will be a big hit for you and Wiley. We look forward to the release in August 09 and will be happy to help promote...


rpbouman said...

Hi All,

thank you very much for the kind words and encouragements!!

I am really thrilled to hear you think this book may fill a gap. Frankly, Jos and I, and obviously our editor at Wiley (Robert Elliot) are also convinced this is the right time for a book like this, but it sure feels good to see this confirmed.

Please stay tuned - I'll be reporting regularly on our progress.

Kind regards, and thanks in advance,

Roland Bouman

Anonymous said...

That's amazing news!

I just came across this posting while looking for a quick reference in the Building Pentaho Solutions PDF. Only problem is that after I read your post I can't remember what I needed a reference for... *Haza goes back to that juicy looking raw XML*.

I'm looking forward to this one!

All the best,


wselwood said...

Very good news. Good luck and I hope it goes well for you.


Unknown said...

I need your book right now !!! Please hurry up !!! I will pay whatever you want !!!

O.k. your job with pentaho is amazing, i've been reading your posts about pentaho and they are amazing... sorry for my english... i speak spanish... bye !!!

ISSAM said...

I need it tooooooooooooooo !!

ISSAM said...

I need it tooooooooooooooooo !

Anonymous said...

I completely agree with Jason above; this book will fill a much-needed gap. I'm currently trying to evaluate Pentaho, but am having a hard time figuring out which transformations will get me where I want.

Their online documentation is ok, but as they're a subscription service with a subscriber community, I get the impression that they're keeping the learning curve intentionally steep.

Anonymous said...

Yes we need this ASAP so if you need it reviewed pre-printing let us know.

Admin said...

This is excellent news. If you need any pre-print reviewer, I am second to volunteer.

Anonymous said...


If you want you could maybe include the doc i did up on setting up Pentaho BI server 2.0 (@

Let us know if you are interested.

- Prashant (schone @ ##pentaho)

rpbouman said...

Hi Schone!

thanks for the link! I'll take a look at it shortly.

How are you doing? How's the ext/js app coming along? I should spend some more time at irc, but I just can't find the time.

Anyways, thanks again!


Anonymous said...

Great news. I wish you success.

Anonymous said...

I really am looking forward to this book. Good info on Pentaho is scattered and difficult. This is one book I'll be pre of luck over the coming months completing it. I hope it is a success.

Anonymous said...

The community desperately needs this book! Looking forward to its release.

rpbouman said...

Hi Anonymous!

I am delighted to say we have almost finished writing the draft. We will then need to process the reviews, which I expect to go according to schedule.

Currently the book is slated to be released early September 2009. You can already pre-order if you like:

Anonymous said...

When are you planning to release the book??

Anonymous said...

When are you planning to launch the book???

rpbouman said...

Hi Anonymous,

It's all in the hands of the publisher now. The good news is, the way things are going now, it will be available as of August 17, 2009, some 2 to 3 weeks earlier than originally estimated.

So, if you want to benefit from the Amazon pre-order discount, you have to hurry.

Hope this helps,


Anonymous said...

Do you have any plan of releasing PDF version of this book?

rpbouman said...

Hi Anonymous,

Good question! Really, it's up to my publisher. I think they do have e-books, but I'm not sure if and when there will be one for "Pentaho Solutions".



Steve said...

I bought your book. VERY HELPFUL! THANKS!

Anonymous said...

Just ordered your book....looking forward to it - roco

Level15 said...

Hi: Small question. I am considering buying the book. Is it focused on the Pentaho Enterprise product or the community product?


Level15 said...

Hi: I am considering ordering the book, but want to know first: is it focused on the community edition of Pentaho or on the Enterprise (aka non-free) edition?


rpbouman said...

Hi Level15,

thanks for your interest! I hope the book will be useful to you :)

the book is almost exclusively focussed on the community edition. We do mention and briefly discuss some benefits of the enterprise edition, but we wrote the book and examples using the community edition.

I hope this helps,

kind regards,

Anonymous said...

I am facing a lot of problems in Design Studio.
Does your book have a chapter dedicated to action sequence creation in Design Studio?

rpbouman said...

Hi Anonymous,

yes, we cover action sequences and Pentaho Design Studio. However, Pentaho Solutions is a really broad, overall book, so we don't delve into the nitty gritty details.

But for example, Chapter 4, "The Pentaho BI Stack", describes the concepts of action sequences, and in Chapter 14 we describe how you can use action sequences to control the scheduler, and how to use action sequences to do complex report generation and delivery ("bursting").

You can also take a look at the pentaho wiki - it contains a pretty detailed description of the action sequence steps.

Anonymous said...

Hi Roland. I bought the book. It is useful. Thank you. After I installed Ubuntu, MySQL, SQLeonardo, etc. (following the advice in your book), I have problems with Tomcat. I saw that you integrated the applications in a particular way (for Linux, installation in /opt/, etc). I'm familiar with Linux in general (PHP + MySQL) but not with Java and Tomcat. I'd be glad of a short tutorial more tailored to the book. Thank you again!

rpbouman said...

Hi Anonymous!

All the information you're requesting should be in Chapter 3 of the book. That said, it is the hairiest bit, and it is easy to get it wrong.

I can warmly recommend the Online guides prepared by Prashant Raju:

Anonymous said...

Roland - I purchased Pentaho Solutions on Amazon and was hoping to get on the companion website to cut and paste notes to myself on the how-to's of the installation of Pentaho / Ubuntu / Tomcat etc. I tried the site mentioned in (one of the first 3 chapters - can't find it now) and it wasn't valid. Is there a valid companion website? The book has been a huge help so far. I'm in chapter 3 trying to make sense of the Tomcat install / config... and pressing on. And then to learn Pentaho. It's a lot to bite off and chew, made much easier by your book. Thanks (and to Jos van Dongen too!).

Anonymous said...

Also, is the untar command on page 39 midway down out-of-date now? It says:

sudo tar -zxvf ~downloads/biserver-ce-CITRUS-M2.tar.gz

Should it read (today)?:

sudo tar -zxvf ~Downloads/biserver-ce-3.5.2.stable.tar.gz

If not, could we get a link to this CITRUS file? Googling etc yielded very little other than old posts by people debugging... Thanks!!

(Books are snapshots, so if this was an older filename, then I completely get it... just making sure)

rpbouman said...

Hi Anonymous!

thank you for purchasing the book - I'm glad it was useful for you so far.

As for the book's website, it's:

(Use the tabs to access the downloads)

As for the untar command - yes, that does seem out of date. My apologies :(

The command you're using does seem to be the right one, at least, 3.5.2 is the current version.

For an up-to-date guide on how to set up the Tomcat server and move the repository to MySQL (or another database), please see the excellent guide by Prashant Raju:

kind regards, and thanks for the comments :)

Anonymous said...

I'm a reader of your book Pentaho Solutions. First of all, I want to congratulate you on your book.

I also have a question. In chapter 10 you said "Please refer to the book's website to download all the transformations and jobs to load the World Class Movies data warehouse". I downloaded all the files from the book's website, but only some of the ETLs are there. I have no problem populating the dimensions, but I don't know how to populate the fact tables; there were no samples in the book, and none on the website either.

Thanks for your reply. I would be really pleased if you could send me the fact-loading transformations and jobs.

P.S. Sorry for my English; I can understand it quite well, but I write it badly.

Anonymous said...

“Integration of Pentaho with Liferay” Looking for input on this..
