Saturday, February 11, 2006

Webenabling OracleXE - You can, Free of cost

In my day job, I'm quite involved as an Oracle Developer. As such, I'm developing and maintaining a particular web-application for one of my clients. Right now, I'm trying to deploy some of the application to OracleXE for several reasons. I was experiencing a slight problem there, because my application uses the 'Oracle mod PL/SQL gateway', wich is essentially an apache module that provides direct access to stored procuders via HTTP requests. You gueassed it...this is not included in the XE product.

Luckily, it can be solved entirely, without much effort and without cost. Just read on, if you want to know how.

Oracle mod_plsql


The architecture of the application is quite straightforward:


  1. The client side front end is made up entirely by the Internet Explorer browser.

  2. From the clients, http requests are issued against a HTTP Server: the OracleHTTPServer. This is just an ordinary Apache webserver pimped up with a plugin (an Apache module) in order to have it handle particular HTTP requests by an Oracle Database. Ultimately, the apache configuration (location directives and the like) is responsible for routing only particular requests to the pls_handler wich is the internal name for the plugin (the module itself is called mod_plsql).

  3. The pls_handler calls a stored procedure that resides in an oracle database. Connectivity issues are all handled via the Apache configuration, and the last part of the HTTP request uri is interpreted as a procedurename. If the URI has a query string (not and SQL query - I mean the ?name1=value1&name2=value2... bit you see in internet adresses), these are mapped to procedure parameters. Inside stored procedures, you can call a particular procedure (htp.p) to pass plaintext to a buffer. When the outmost procedure ends, the plugin uses the buffer contents and uses that as the HTTP reponse.

  4. The pls_handler passes the HTTP response on to the webserver, which sends it to the client that issued the original HTTP request.


For some more info, check out this FAQ.

What about php, asp and jsp?


So, the data access layer is stored inside the database. Queries or other procedure calls all reside inside the database. This is different in the dynamic web-page solutions such as asp (classic asp used javascript or vbscript; nowadays its all .NET), jsp (java) and php.

With mod_plsql, there's only a thin component inside the webserver that routes requests to the database. mod_plsql knows nothing about the underlying database - it just tries to call the procedures requested via the URI. In a dynamic webpage solution, the page needs to have at least some knowledge about the database schema to be able to interact with the database. In a minimal scenario, such a solution needs to know the name of at least one stored procedure or table. In a worst case scenario, there are multiple pages that each contain a couple of SQL statements in the form of string.

The advantage of a solution like mod_plsql is that all references to the database schema are located inside the database. If something is modified in the database structure, all dependant objects, including the stored procedures that make up the web application, are marked invalid, wich means they require to be recompiled. So, you are immediately aware what parts of the application need to be modified in concert.

Of course, mod_plsql is not the solution to all web application problems - far from it. For example, suppose you need your webpage to generate a png image or a pdf document on the fly - this is not a job that's particularly suitable for a database procedure. But in cases were the application is data-oriented or were one needs to develop only a HTTP Service that returns only data (for example, in an XML format) mod_plsql is actually quite good.

There's of course another disadvantage concerning mod_plsql. It is entirely specific to Oracle. In the early days (I think it was about 2000), the webserver and mod_plsql component were shipped separately (and had to be paid for separately too). Later, the webserver was bundled with the database product, wich was Ok as long as you purchased the database.

OracleXE and mod_plsql


Recently, Oracle released it's express edition, OracleXE. This is a free ('...as in bear') edition, that is to say; it is free of cost even for commercial deployment. Alas, this version does not include mod_plsql. So, webapplications that are dependant upon mod_plsql cannot be ported to Oracle XE without having a separate installation of the Oracle HTTP server including mod_plsql.

Although finding out that XE didn't include this functionality wasn't a big surprise, it was a bit disapointing. For a couple of reasons, my client needs to duplicate it's environment - with the webenabling - and the particular Oracle license prevents that. That is, the client only has a single processor license which means we may not run (even install) that software on other machines simultaneously. Given the cost-drive, I thought XE would be a perfect match as far as database capabilities are concerned, but I really need the mod_plsql functionality as well.

As it turns out, there is a very good solution for all of this:


So, instead of having a installation of the 'Oracle HTTP Server', you just plug mod_owa into your apache server yourself - works great!

mod_owa: first impression


You can download mod_owa from Doug's site. Both source as well as binaries for a number of platforms are distributed there. The site forms the online documentation, and the distribution includes documentation as well. However, I think this documentation is not suitable for those that have never worked with mod_plsql before. In fact, I recommend those people to read the official documentation first and to keep a copy nearby.

One of the first differences between mod_plsql and mod_owa is the way configuration is handeld. Mod_plsl ships with a couple of webpages you can use to edit the configuration such as adding a DAD (Database Access Descriptor, wich is like a 'virtual directory' for wich requests should be handled by the pls_handler). In the Oracle HTTP Server version I use, the mod_plsql specific configuration is stored in a little text file that means somthing to mod_plsql only. Mod_owa is entirely configurable through the apache configuration; something I like better than the original (especially since I'm always hacking the OracleHTTPServer configuration directly in a text editor afterwards anyway).

Another difference between Mod_owa and mod_plsql is the fact that you can plug mod_owa into an Apache 2.0 server. mod_plsql runs on Apache 1.3. (BTW, This is availble for mod_owa too). Altough the documentation suggests there could be made some improvements on the mod_owa implementation for Apache 2.0, I tink it's cool Doug has done something Oracle didn't take on as of yet. (Something wich I find quite bad, I mean, how long has Apache 2 been around?)

There are a couple of differences between the mapping of request uri's to database stored procedure calls. Its nothing big, but it really helps if you're aware of the differences.

Both mod_plsql and mod_owa derive the procedure name from the resource name part of the request uri. In both cases, the apache configuration is used to route particular uri's paths to the module. The resource name is interpreted as the procedure name (this may include an owner and package if it is a packaged procedure). If the request URI included (http) parameters, these are usually simply mapped to proceudre parameters:


--
-- pleas call me on the browser like this:
--
-- http://www.host.com/pls/dad/p_test?p_text=hello%20world
--
create or replace procedure p_test(
p_text varchar2
)
is
begin
htp.p(p_text);
end;


So here, there's a rather tight mapping between the URI query name - value pairs and the stored procedure parmeter name and value.
This works the same in both mod_plsql and mod_owa.

You can also use 'flexible parameter passing'. In this scheme, the procedure must be declared to accept PLSQL arrays (actually, it's called a PLSQL Table, wich is a integer indexed associative array). Flexible parameter passing is controlled from the HTTP request uri by prefixing the resource name with an exclamation mark (!).

Both mod_owa and mod_plsql support the so-called 4-parameter interface:


--
-- pleas call me on the browser like this:
--
-- http://www.host.com/pls/dad/!p_test?name1=value1&name2=value2...etc.
--
create or replace procedure p_test(
num_entries IN NUMBER
, name_array IN owa_util.vc_arr
, value_array IN owa_util.vc_arr
, reserved IN owa_util.vc_arr
);


Inside the procedure, the name_array and value_array arrays can be read to get to the actual values passed at the url. Each position present in either of the arrays corresponds to the same position in the ohter array, so for each position in either of the array's the original name/value pair can be reconstructed.

An interesting point is that mod_plsql is very picky about both the parameter names (must be exactly num_entries, name_array ..etc). Mod_owa is more lenient, and just tries to call the procedure derived from the URI. This could a potential security hazard id you're just blindly porting your code: mod_owa will allow any procedure that matches the proceure name and datatypes to be called, wherea mod_plsql needs to match all the names exactly.

Another difference is that mod_plsql also supports the 2-parameter interface. This is exactly like the 4-parameter interface, but without the first and last element in the parameterlist:


--
-- pleas call me on the browser like this:
--
-- http://www.host.com/pls/dad/!p_test?name1=value1&name2=value2...etc.
--
create or replace procedure p_test(
num_entries IN number
, name_array IN owa_util.vc_arr
, value_array IN owa_util.vc_arr
, reserved IN owa_util.vc_arr
);


So, because owa does not use this, I simple overloaded my plsql:calls:


--
-- pleas call me on the browser like this:
--
create or replace procedure p_test(
name_array IN owa_util.vc_arr
, value_array IN owa_util.vc_arr
);

create or replace procedure p_test(
num_entries IN NUMBER
, name_array IN owa_util.vc_arr
, value_array IN owa_util.vc_arr
, reserved IN owa_util.vc_arr
)
is begin
p_test(
name_array
, value_array
)
end;


Apart from all the stuff i mentioned here: mod_owa has some extra functionality still as compared to mod_plsql, such as caching etc.

Anyway, thanks Doug for this terrific contribution. I'm looking forward to show this solution to my customers.

2 comments:

Anonymous said...

Good information!

It's also worth pointing out that you can use the built-in web server, although this will not be as scaleable as an external web-server.

DBMS_EPG allows you to create a DAD which may be accessed in a similar manner to the mod_plsql gateway.

Lots more info in XE forum regarding this.

rpbouman said...

Thanks for the comment.

It's very true of course - OracleXE comes with a builtin HTTPListener (and much more); an architecture that will be quite familiar to most of the folks that have built webapplications with Oracle WebDB.

However, I need one OracleXE instance to host one instance of the application - this corresponds to exactly one of my customer's clients. All of this needs to be served from one generic portal - at least from one specific domain.

So, one webserver linked to the domain that routes the requests through to the pertinent oraclexe instances seemed the most straightforward thing to do.

Another advantage of having one portal that serves as an endpoint for the internet traffic is that you can put that server outside the firewall, and have oraclexe run behind the firewall.

Meanwhile I got some very helpful comments - indeed at the OracleXE forum - that suggest you can also run apache as a proxy server for many oracle xe instances (http://forums.oracle.com/forums/thread.jspa?messageID=1103704).
That would probably give me an equal solution securitywise.

I might try that architecture too, just to see what works out best.

Anyway, thanks again for the suggestions - appreciate it a great deal!

DuckDB Bag of Tricks: Reading JSON, Data Type Detection, and Query Performance

DuckDB bag of tricks is the banner I use on this blog to post my tips and tricks about DuckDB . This post is about a particular challenge...