Friday, December 23, 2005

Reporting for the Masses: Eclipse/BIRT, Apache Tomcat and MySQL

I got it working...

Eclipse


About two months or so ago, my attention was caught by that remarkable open source product, Eclipse. (Maybe a lot of people are saying "..duh!.." out loud right now, but hey, I can't hear you.)

I wrote the MySQL Connector/J example accompanying the article in the Call Level Interface section of Andrew Gilfrin's site www.mysqldevelopment.com using Eclipse, very much to my satisfaction.

When I started to write the example, I was using Sun's NetBeans, but it was simply too slow. Especially as my source grew bigger (I'm talking only several hundred lines here, which shouldn't be a big deal), the editor lagged and froze to a degree that simply wasn't acceptable to me. Clearly, I needed something else, and Eclipse happened to be the first thing I thought of using instead.

BIRT


Browsing through some of the other, non-core Eclipse projects, I noticed the BIRT project. And, quite shortly after I became aware of BIRT's existence, an article was published on the MySQL website: Using BIRT To Report On Bugzilla in MySQL. I knew then that I had to check it out for sure.

For those who have never heard of it, BIRT stands for "Business Intelligence and Reporting Tools". BIRT lets you extract and present data stored in various types of data sources (such as relational databases, but also comma-separated files and some others). Such a presentation, or report, can be a powerful tool for upper and middle management to analyze, control and plan their business. (You can take a look at a DEMO flash movie here)

Of course, reports can have much more modest purposes too. For example, an order entry system could be required to print out an entire order, including its order lines, so it can be sent to the client as an invoice.

Now, I have worked with some closed source Business Intelligence and Reporting Tools, and I think it's fair to say that at this moment, BIRT is best characterized as a reporting tool rather than a Business Intelligence tool.

True Business Intelligence tools offer all kinds of instruments to analyze data, quite often in a dynamic, flexible, ad hoc manner. At this moment, BIRT is not such a tool. For those wondering which tools qualify as true Business Intelligence tools: I'm thinking of OLAP and data mining tools such as Cognos PowerPlay, Oracle Discoverer and Microsoft Analysis Services.

BIRT lets you define sets of data (for example, from a SQL query) and then render that data into a mainly static layout. There is quite some choice, of course: data may be rendered as simple plain text, in a complex tabular grouped format, or even as a colourful graph or diagram. It should be possible to add some user interaction too, because you can put hyperlinks and JavaScript into the report (I haven't tried that yet though).

There is also limited support for some analytics. The data drawn from the query can be grouped at various levels. Aggregate functions can be applied independently to these groupings, showing both the detail rows and the aggregates (think of a grand total for all ordered items on the invoice).
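
To make the idea of aggregates at several grouping levels a bit more concrete, here is a minimal Python sketch. It is not BIRT code and says nothing about how BIRT implements grouping; the rows are made-up numbers and the function name subtotals is just something I picked for illustration. It only shows the principle: the same detail rows are summed once per year, once per month, and once for the grand total.

```python
from collections import defaultdict

# Hypothetical detail rows: (year, month, store_id, rentals).
# The numbers are invented for illustration; they are not sakila data.
ROWS = [
    (2005, 5, 1, 10),
    (2005, 5, 2, 12),
    (2005, 6, 1, 20),
    (2005, 6, 2, 18),
    (2006, 2, 1, 3),
]


def subtotals(rows, key_length):
    """Sum the last column per group; a group is the first key_length columns."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[:key_length]] += row[-1]
    return dict(totals)


per_year = subtotals(ROWS, 1)         # one total per year
per_month = subtotals(ROWS, 2)        # one total per (year, month)
grand_total = subtotals(ROWS, 0)[()]  # the empty key groups all rows together

for (year, month), total in sorted(per_month.items()):
    print(year, month, total)
print("grand total:", grand_total)
```

A report with grouping levels does essentially this, while still printing the detail rows between the subtotal lines.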

Actually, I think BIRT bears quite some resemblance to Microsoft Reporting Services. This is especially true when designing reports. The Eclipse IDE, in which you design BIRT reports, is the counterpart of Microsoft Visual Studio (which is used to create MS Reporting Services reports). Both reporting platforms are extensible. Of course, BIRT is so by definition, as it is open source; but the methods for extending Reporting Services are documented too, and the specification is available. The language in which reports are defined is an XML format for both platforms.

(BTW - the XML format is something I like very much in both these products. It offers all kinds of opportunities to manipulate report sources. For example, because this is all XML, it shouldn't be too hard to translate report sources from one platform to the other using simple XSLT transformations. Another thing you could do is build your own implementation, generating source code in, say, PHP. Again, all you'd need are XSLT transformations.)
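
As a small illustration of why an XML report format is convenient, here is a hedged Python sketch that pulls the queries out of a report definition. Note the assumptions: the XML below is made up, its element and attribute names are placeholders and not the real BIRT or Reporting Services schema, and it uses Python's ElementTree rather than the XSLT transformations mentioned above. The point is only that standard XML tooling is enough to inspect or rewrite a report source.

```python
import xml.etree.ElementTree as ET

# A made-up report definition. The element and attribute names are
# placeholders for illustration, NOT the real BIRT or Reporting Services schema.
REPORT_XML = """
<report>
  <data-set name="rentals_per_store">
    <query>select store_id, count(*) from rental group by store_id</query>
  </data-set>
  <data-set name="rentals_per_category">
    <query>select category, count(*) from rental group by category</query>
  </data-set>
</report>
"""


def list_queries(xml_text):
    """Return a {data set name: query text} mapping for the report XML."""
    root = ET.fromstring(xml_text.strip())
    return {
        data_set.get("name"): data_set.findtext("query").strip()
        for data_set in root.iter("data-set")
    }


queries = list_queries(REPORT_XML)
for name, sql in queries.items():
    print(name, "->", sql)
```

The same approach would work in the other direction: read one format, write out another.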

Of course, there are also some differences between these products. Here are some obvious ones which do not directly impact the functionality of the report:


  • BIRT is open source under the Eclipse Public License - Reporting Services is closed source, proprietary software, subject to its own product license

  • BIRT is based on Java technology - Reporting Services is built on .NET technology

  • BIRT has JDBC database connectivity - Reporting Services has ODBC and ADO .NET connectivity

  • Syntactically, BIRT expressions are JavaScript expressions - Reporting Services expressions are Visual Basic .NET expressions

  • BIRT does *NOT* have crosstabs as of yet - Reporting Services does, of course

  • BIRT datasources are defined on the report level - Reporting Services datasources are defined on the Report Server and/or Project level (which I think is much more convenient than in BIRT)

  • A lot of output formats are available for Reporting Services - BIRT can only output HTML or PDF (I'm really missing an XML output format)



There are also significant differences in architecture:


  • Although you can use Microsoft Reporting Services on various databases (ODBC, ADO), you really need a Microsoft SQL Server instance, always! The report data is processed and cached in a SQL Server database, no matter the actual data source. The actual report is run from the cached data. BIRT is dependent only upon a runtime library. No datastore is required to cache data.

  • Reporting Services is integrated into Microsoft SQL Server and Microsoft Internet Information Services (webserver). So, there, deployment is really easy. BIRT is just a reporting tool. You'll have to hook it up to an application server yourself - one capable of running Java servlets. Deployment is then done manually, by copying the source files to the application server.



Tomcat


Of course, for it to be a serious solution, BIRT really needs to be hooked up to an application server. You can't expect your users to open Eclipse to preview their reports.

Luckily, the BIRT manual explains exactly how you can run your reports from your Apache Tomcat application server. Some time ago, I installed one of those too, but I never got round to playing around with it. This was my first chance to do that too.

I expected it to be a big deal - it's not. In a minimal setup, you need to create a BIRT directory under Tomcat's webapps directory and copy one of BIRT's components, the report viewer, to that directory. It's a walk in the park: I had it running within minutes, having read the relevant chapter in the BIRT manual.
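
The copy step can be done by hand, but for what it's worth, here is a minimal Python sketch of it. Everything specific in it is an assumption: the function name deploy_viewer, the default directory name "birt", and the example paths in the comment are hypothetical, and the real location of the report viewer depends on your BIRT download, so follow the BIRT manual for the actual details.

```python
import shutil
from pathlib import Path


def deploy_viewer(viewer_dir, tomcat_home, app_name="birt"):
    """Copy the report viewer into <tomcat_home>/webapps/<app_name>.

    Returns the target directory. Refuses to overwrite an existing deployment.
    """
    target = Path(tomcat_home) / "webapps" / app_name
    if target.exists():
        raise FileExistsError(f"{target} already exists; remove it first")
    shutil.copytree(viewer_dir, target)
    return target


# Example call with hypothetical paths (adjust both to your own installation):
# deploy_viewer("/path/to/birt/report-viewer", "/path/to/tomcat")
```

After copying, Tomcat typically needs a restart (or a reload of the web application) before the viewer becomes available.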

MySQL - Sakila



I used the sakila sample database (download here) to do some reporting.

I devised two little queries that I think would be useful had I started a career as a DVD rental shop owner: Rentals per store through time, and Rentals per category.

Here's Rentals per store:

select     DATE_FORMAT(last_day(r.rent_date),'%Y') as rent_year
,          DATE_FORMAT(last_day(r.rent_date),'%c') as rent_month_number
,          DATE_FORMAT(last_day(r.rent_date),'%b') as rent_month_abbreviation
,          i.store_id                              as store
,          count(*)                                as count_rentals
from       rental r
inner join inventory i
on         r.inventory_id = i.inventory_id
group by   rent_year
,          rent_month_number
,          store


I rendered it as a bar graph:

[Figure: a bar graph of sakila rentals per store through time]


Here's Rentals per category:

select     c.name   as category
,          count(*) as count_rentals
from       rental r
inner join inventory i
on         r.inventory_id = i.inventory_id
inner join film f
on         i.film_id = f.film_id
left join  category c
on         f.category_id = c.category_id
group by   category


I rendered it as a pie chart:

[Figure: a pie chart of sakila rentals per category]


Quite interesting results, actually. I'm not experienced in DVD rentals, but this does look like quite a homogeneous set of data to me. Anyway, I just heard Mike Hillyer is going to discuss the sakila database at the MySQL user conference in April 2006. We'll probably hear exactly how he generated this set.

I just want to add that it took me just a couple of hours to explore BIRT and the underlying XML format (don't worry, designing is all GUI) enough to come up with these. Apart from these graphs, my first report also includes a table of rentals per store with subtotals for several grouping levels. I skipped the tutorials; I just watched the Demo Flash movie (link at the top). My conclusion is that Eclipse/BIRT is intuitive enough to build serious reports.

Also, I'm impressed with the performance when running the reports. I'm running Eclipse/BIRT, MySQL 5.0.17 and Tomcat all on my 1700 MHz / 1 GB RAM ThinkPad, and the graphs appear almost instantly (yup, I disabled my browser cache; it really is fast). We are looking at 16088 rows of rental data. I'm impressed this speed is possible even without a data cache such as the one used by Microsoft Reporting Services.

Final thoughts


You can put together a reporting platform based exclusively on open source products. It's quite easy to set it all up. BIRT may not be as advanced as some commercial products, but because no financial investment is needed to purchase software, there's no risk in trying. In fact, I think that in a lot of cases, the functionality provided by BIRT is quite sufficient to serve most reporting requirements.

Although BIRT may lack some more advanced features such as crosstabs, most of these will probably be added in the foreseeable future. Take a look at the BIRT project plan to see what is going on. I'm quite convinced we won't have to wait too long before we see a crosstab inside BIRT.

11 comments:

Anonymous said...

Hello, excuse me. I have read your post and it looks very interesting.

You see, I am from Venezuela and I am doing my internship. I am currently working with the technology you describe (BIRT, MySQL and Apache Tomcat), but I have not managed to make the connection between BIRT and Apache. Would you be so kind as to explain that point a little better? My email is edwynmayorga@gmail.com

Anonymous said...

To understand BIRT, you have to understand wenfeng li who named himself PMC lead of BIRT and the drive behind him.

Wenfeng li came to Actuate a few years ago and brought a huge lawsuit from his former employer. Then he acted crazily:

He proposed to recruit people from China after grabbing a lot of his old friends around him...

His multi-million lawsuit partly forced Actuate to lay off staff several times...

Lately he started touting BIRT. The truth is that he has very limited knowledge of and experience with Java technology, and he doesn't understand much about the Actuate product either.

The only thing he can do is tout himself and add version numbers to praise himself...

Please draw your own conclusion...

rpbouman said...

Hi mr Anonymous,

I think your comment is at least in a particular way relevant to the article, so my principles bid me to publish it.
In the future, please observe basic courtesy and sign your name underneath the comment when you want to make accusations. It makes it easier for the accused party to defend themselves.

I must admit that I have never heard of mr. wenfeng li. However, I can't say that I feel that not knowing him has hampered my understanding of BIRT in any way.

Although I know the basics of Java, I surely can't boast of deep Java knowledge myself, but I don't really see how this can be relevant.

I must admit that it is not really clear to me what you hold against BIRT or mr li for that matter, so please feel free to explain, and please sign the comment with your name so everybody knows where this is coming from.

thanks for understanding,

Roland Bouman.

Anonymous said...

How do you feel about BIRT compared to other open source BI projects?

Mr. wenfeng li used to claim to be NO.1 BI vendor in China by the end of 2006...

Mr. wenfeng li claims that millions of Java developers are using BIRT and he boasts about the "massive success" of BIRT...

rpbouman said...

"How do you feel about BIRT compared to other open source BI projects?"

Well, at the time, Jasper was the only open source reporting software. I think variation is usually good, because it drives competition. In comparing Jasper and BIRT, I'd say that the nice thing about BIRT is the good Eclipse integration.

I cannot say much about how their intrinsic reporting capabilities compare. I'd say that functionality-wise, Jasper is more advanced - not surprising considering it has been around much longer.

"Mr. wenfeng li used to claim to be NO.1 BI vendor in China by the end of 2006..."

Well, that could very well be. I must say that from a business perspective I don't usually get warm when people claim they are the "Xth open source Y" or the "Yth free software Z". I feel that the fact that something is open source or free software is a good extra reason that makes a product desirable. In a lot of cases, I don't think that the openness or freeness should be a prima facie reason to want the product.

"Mr. wenfeng li claims that millions of Java developers are using BIRT and he boasts about the "massive success" of BIRT..."

Well, "millions" seems like a bit of an exaggeration. I do think you can call BIRT a success, and I think that certainly entitles him to some boasting if he's responsible for it.

Unknown said...

BIRT may not load and work due to java security restrictions. In the /etc/init.d/tomcat4 file, change Java security manager to "no" so it reads like this:

# Use the Java security manager? (yes/no)
TOMCAT4_SECURITY=no

Anonymous said...

Roland,

Have you found that Eclipse freezes when you try to add an aggregate function to a Data Set?

rpbouman said...

Hi Rupe,

No, I have not experienced that problem in particular. I have noticed that the report designer can become incredibly sluggish as the layout becomes only moderately complex (for example with a table control).

With aggregates, I noticed that the editor seems to have a problem with multiple, similar aggregates using different scopes. So for example, when you try to render a report breakdown per year, quarter and month into a table, and try to calculate the same aggregate, say COUNT, for each group, the editor just can't seem to remember these are all different. My current solution is to hack the XML directly for that; that sorta works.

kind regards,

Anonymous said...

What is your comparison between BIRT and Pentaho?

Also, "Microsoft opts to build, not buy, business software". How will this impact open source software in the BI sector?

http://yahoo.reuters.com/news/articlehybrid.aspx?type=comktNews&storyID=urn:newsml:reuters.com:20070510:MTFH25913_2007-05-10_02-02-56_N09335242&pageNumber=1&imageid=&cap=&sz=13&WTModLoc=HybArt-C1-ArticlePage1

rpbouman said...

Hi Anonymous,

Regarding Pentaho and BIRT: well, BIRT is first and foremost a reporting engine and an Eclipse plugin to design reports.

Pentaho is essentially an entire stack that contains components from all areas in the BI arena: that includes reporting, but also OLAP tools, ETL/data integration, data mining, an application server, business process integration, scheduling and distribution.

Pentaho is built in a way that it allows other J2EE applications to be plugged into it. So Pentaho can be used together with their own reporting engine, jFreeReport, or with BIRT, and even with Jasper reports.

So it does not make a lot of sense to compare these products. As far as the Pentaho jFreeReport reporting engine is concerned: I haven't played around with it a whole lot. I do notice that it is extremely actively developed, but up till now I am still missing a tool to design the reports. Of course, such tools exist, but up until now I have used BIRT for the ease of use of its report designer.

However, I have recently been checking out an extremely interesting development in that area, the Pentaho ad-hoc report designer. This is an AJAX web application that allows end users to build reports on a metadata layer like you have in Oracle Discoverer, Business Objects and recently also MS BI. It is in the latest build of Pentaho but it is under heavy development. I suspect that it is going to be the tool I will be moving to once it is stable.

Anonymous said...

re:



BIRT may not load and work due to java security restrictions. In the /etc/init.d/tomcat4 file, change Java security manager to "no" so it reads like this:

# Use the Java security manager? (yes/no)
TOMCAT4_SECURITY=no

By soarhevn, at 6:39 AM

THANK YOU THANK YOU - I have taken two days on and off trying to get mine to work and that solved it instantly. Not sure what it does but for the moment I don't care. I am sure I will need to know at some point but hey. Thanks again soarhevn and thanks Roland for hosting the blog :)
