Locations of visitors to this page Book: "Pentaho Kettle Solutions", Matt Casters, Roland Bouman, & Jos van Dongen, Wiley 2010 Book: "Pentaho Solutions", Roland Bouman & Jos van Dongen, Wiley 2009

Saturday, June 02, 2007

Google Gears: Webbrowser embedded Database

Google Gears is an open source browser extension created by Google. It provides a framework that allows the creation of offline webbrowser applications. At the moment it provides three services:

Local Server

A data store for static resources. This allows efficient client-side caching of images, (client-side) scripts and webpages

Database

An embedded relational database management system, based on SQLite. SQLite should be familiar to most PHP developers, as it is being shipped with PHP since version 5

Worker Pool

A form of threading support inside the browser that allows webapplications to initiate long running processes without hampering the responsiveness of the user interface.
All these services can be accessed from within the browser using a javascript API.

Users only needs to install the extension in order for the browser to be able to access the services when browsing pages.

If you want to get an immediate taste of the database service, be sure to install Google Gears and take a look at my offline, browser-based database client.

Currently, the extension is available for mozilla Firefox and Internet Explorer.

What does this buy us?


Google Gears offers an enormous potential for web application developers, for a number of reasons:
  • The local server can be used to setup a specialized cache that can help to run web applications off-line.
  • The database service offers not only a structural solution for maintaining state at the client side, it even allows for persistence. That is, if you close the browser, shut down the computer, and restart, the application can pick up where it left off!
  • I am not really sure about the worker pool but I can imagine this can be very useful to perform tasks that require a connection with a server. For example, the worker could be instantiated that attempts to synchronize the local state and data gathered by the application to a webservice that acts as the terminal destination.
Of course, we all have to see real-world results, but I think this truly is a revolutionary development. It finally makes sense to start thinking of "browser applications" instead of "web applications", because the application does not necessarily have to connect to a HTTP server anymore to be useful.

I would not be surprised if this would initiate a new wave of emancipation of relational databases. Literally everybody that can operate a webbrowser will now own a database, and all the glue to make it work together with the user interface is already there. Also, a number of mobile service applications might be implemented on top the Google Gears embedded SQLite database instead of a specific standalone database product.

The web browser offers a reasonably sufficient environment for creating data oriented end-user applications, and the combination might prove to be a killer solution. The SQLite database can still be accessed outside the browser too, and be integrated with other applications.

Prerequisites

There are few preparations that need to be taken before you can use google gears in a web page. The prerequisites are that the browser that will view the page has the Google Gears browser extension installed. The page itself needs to include the gears_init.js script in order to use the javascript API to access the services. This single line in the <head> element of the page allows access to the API:

<script type="text/javascript" src="gears_init.js"></script>

The gears_init.js script needs to reside at the same location as the page in this case, but it may be referenced from any location.

One of the fascinating things is that neither the page nor the script need be on a webserver. You can simple store the script and the page on the local filesystem.

When opening the page in the browser, you will be prompted to allow the page to access the Google Gears services:
GG_Warning
So the user needs to allow access for each different site that wants to use the services.

A closer look at the Database service

The Google Gears database service can be accessed using a simple API. All of the Google Gears APIs are javascript (ecmascript) APIs.

Creating a database object and opening a database


First a database object needs to be created using a line like this:

var db = google.gears.factory.create("beta.database", "1.0");

The variable db can then be used as a handle to open the actual database. The following line will create a database named GGDB:

db.open("GGDB");

Database names are unique per application, or rather origin. All pages that are located in a particular scheme, host and port are considered to be part of the same application. (For the web location http://www.foo.com/, http is the scheme, www.foo.com is the domain and the port is the default port for the scheme, which is 80 in this case). Pages can only access databases that were created by a page in the same domain.

This 'same domain' restriction is an understandable security measure: if we visit https://www.mybank.com/, we do not want the data stored in any of the databases used by that site to be accessible by pages from http://www.bankrobbers.com/.

Creating a table

The database object implements the execute method, which allows one to execute SQL statements. For example, if we want to create a table to store contacts, we could try something like this:

db.execute(
"CREATE TABLE persons("
+ " id INTEGER PRIMARY KEY AUTOINCREMENT"
+ ", first_name"
+ ", last_name"
+ ")"
);
When you are not used to SQLite's specific philosophy of data storage, the previous statement must've made you frown at least once.

Yup people, SQLite does not require us to specify column data types. It is allowed to include an identifier where you'd normally put a data type name. However, this 'data type' does not restrict the values that maybe stored in that column. There is only one exception to that rule: when a column is defined with INTEGER PRIMARY KEY AUTOINCREMENT, it denotes a surrogate key that will store only integral values.

In spite of the fact that SQLite does not use data types for table definitions, SQLite does have a type system. SQLite has a concept called storage classes: each value entered into the database is associated with one of the following storage classes:

    NULL

    A special storage class for SQL NULL values

    INTEGER

    Storeas integral values

    REAL

    Stores numerical floating point values

    TEXT

    Variable length strings

    BLOB

    Variable length binary strings

However, a storage class is bound on the level of a value, not a column. So, one column can contain multiple values, and each is associated to one of the storage classes.

Inserting data


In Google Gears, all interaction with the database is performed by issuing SQL statements through the execute method of the Database object. So, if we want to insert data into our persons table, we need to do this with a SQL INSERT statement.

The first argument of the execute method is the actual SQL statement, and we can use this to pass a complete INSERT statement, including any value literals. However, we can also use parameterized statements, using placeholders instead of value literals. The remainder of the argumentlist must then be used to provide values for the placeholders in the SQL statement text:

db.execute(
"INSERT INTO persons (first_name,last_name)"
+ "VALUES (?,?)"
, "Roland"
, "Bouman"
);
This manner of passing values offers the advantage that you do not need to escape quotes from the values: this is automatically taken care of by SQLite.

Processing resultsets

A call to the execute method always returns a ResultSet object. In the previous examples, execution of the statements did not result in a set of rows, but a javascript ResultSet object is returned regardless.

The following snippet illustrates how you can process the rows in the ResultSet object:

var result = db.execute( //execute query, get result
"SELECT * FROM persons"
);
var fieldCount = result.fieldCount(); //get number of columns
while(result.isValidRow()){ //as long as we did not process all rows
for(var i=0; i<fieldCount; i++){ //loop over all columns
var value = result.field(i); //get value of the current column
}
result.next(); //process the next row
}
result.close();
The fieldCount method of the ResultSet object can be used to return the number of columns associated with the ResultSet object. If the SQL statement that yielded the ResultSet object did not return a set of rows, the fieldCount method returns 0.

The isValidRow() method of the ResultSet object returns a boolean that indicates whether there are still rows associated with the ResultSet object that may be processed. The next() method of the ResultSet object proceeds to the next row in the resultset. After calling the next() method, the isValidRow() method must be used to detect whether the resultset is already exhausted. Together, the isValidRow() and next() methods can be uses this to drive a loop to iterate over all the rows associated with the ResultSet object.

The field(columnPosition) method of the ResultSet object can be used to obtain the value for the column at the specified position. The fieldByName(columnName) method of the ResultSet object can also be used to obtain the value of a column, but as indicated by the method name, the column must be specified using the name of the column. The fieldName(columnPosition) may be used to retrieve the name of the column at the specified position.

When we are done processing the resultset, we are required to call the close() method of the ResultSet object. There are plans to implement a way to automatically close the ResultSet object when it isn't used anymore, but even then it can't hurt to close the ResultSet object explicitly.

Handling Errors

Although Google Gears is great, we still are not in a perfect world. Runtime errors may and will occur, and to handle them you need to use the javascript try...catch syntax. This is not so much a Google Gears thing, but as the database API will throw a runtime error for syntax errors, database constraint violations etcetera, you simply myst use try...catch blocks everywhere.

The following snippet illustrates how to do this:

try {

//stuff that can go wrong

} catch(e){

//use the message property
//from the Exception object
alert(e.message);

}

A Quick and Dirty Command Line Client


To play around with the Google Gears Database service, I made a quick and dirty browser-based Command line client for the SQLite database. You can use the online version here. If you want to use it offline, no problem, just download save the page anywhere on your local disk. Just be sure to save a copy of the gears_init.js script in the same directory. That's it....on to the offline web!

3 comments:

Sam Sethi said...

Hi Roland

Great post I am wondering why GGear uses SQL to SQlite and yet GData uses RESTful HTTP Query request to an online datastore.

Given that the offline datastore is accessed via the browser, a RESTful UI to the local store would have made more sense to me and equally it would have been consistent whether offline or online.

Any thoughts greatly appreciated.

Dennis Decoene said...

The idea behind Gears if really fantastic. A concept I've been dreaming of for many years.
However, from what I read here and from other sources, it seems like the whole app has to be written entirely in javascript/html and run on the local machine.

This comes awfully close to traditional desktop apps with their inherent problems of deployment, cross platform compatibility and maintainability.

Is an app written in this sense not vunerable to the same kind of problems as a regular desktop app, with the additional problems of Javascript?

If so, I don't see the advantage right now of this versus a traditional client. Do you?

Roland Bouman said...

Sam,

well, personally, I very much like the idea of having a relational database as an application data store. SQL is a well defined, powerful data manipulation language whereas REST is a style of data exchange.

With REST, a proper language for searching, grouping etc needs to be invented, standardized, debugged...and adopted by developers. Sticking to a well known, widely adopted language remedies all this. Also, it will make it easier to move traditional client/server applications to "offline web" applications.

So I think Google did it exactly right in this respect ;)

Dennis: Well, it is true that application must consist of resources that your web browser can handle. This will be html/css/javascript about 90% of the time, but as far as I understand, there is no reason why this should not also include XSLT, XUL, applet .class files, ....

"This comes awfully close to traditional desktop apps with their inherent problems of deployment, cross platform compatibility and maintainability."

Well, no, not in my point of view. With Google Gears, the application can of course be a hybrid between a true web application that uses the offline capabilities for failover ("Sometimes connected applications").

But let's forget those hybrids for a moment and an focus only on the offline capabilities.

First of all, you have a free builtin update service for your application. And, in many cases, updates will come in as http over port 80, so not some obscure application specific port and protocol where the user has to worry whether it should or should not be blocked by the firewall.

Second, you seem to dislike the web client side as an application development environment...whereas I love it! I mean, html + css is a very powerful, declarative way to 'paint' your application. It is of course a matter of taste but I can be more (way more) productive with that than with, say, java swing or windows forms or what have you. So I would certainly call that an advantage.

You also mention some "additional problems of javascript". I don't really know what you mean by this. I think javascript has one issue IMO and that is the lack of class inheritence. I can imagine that javascript may not be suitable for really heavy crunching jobs, although I don't think that is an issue for many apps.

If you mean that there are still browser compatibility problems with javascript and html/css, well that is true. But, and this is IMO a big difference with say 2 years ago, we now all have access to truly beatiful and excellent open source javascript frameworks like YUI/EXT and Dojo to name a few. And any initial lags because of loading the javascript libs can now be solved with the local server feature of google gears.

More advantages of web-based applications: web browsers are around everywhere and need no additional installation. That means there is no hurdle for application adoption, and no platform dependencies. Granted you have browser incompatibilities, but like I said - that is pretty much solved when you stick to a javascript framework.

As a last advantage I want to key here is the sheer number of web developers that are around today. Think of it - that is an entire generation of programming power that has thus far been confined to developing connected web applications...and suddenly, they can build applications with client/server like properties...or even just client applications! And think of it, the code these applications will be running will all be human readable, in many cases probably released under some open source license....so where does that leave the traditional desktop application ;)

I can continue and sum up many advantages for hybrid applications too, but I think you get the idea ;)

regards, and thanks for your comments,

Roland.