Showing posts with label i18n. Show all posts
Showing posts with label i18n. Show all posts

Sunday, June 18, 2017

UI5: per-view Internationalization - What about inheritance?

I started to learn UI5 in earnest in september 2016. Quite soon after that I wrote two blog posts about per-view internationalization:

(In case you're wondering: per-view internationalization is the ability to maintain i18n.properties files, which contain translations for human readable texts together with the view code wherein these texts appear, rather maintaining just one giant i18.properties file that contains the translations for any and all internationalized texts that appear throughout the application. I think the benefits of this idea are clear from a developer's point of view. If you're interested in more information about this concept, then please check out the two blog posts I wrote prior)

Per-view i18n and inheritance

The reason for me to revisit the topic is that I found that the method I use to figure out which i18n.properties files to load, doesn't work very well when you're using inheritance. In UI5, inheritance is achieved by calling the extend-method on the constructor you want to inherit from.

To understand why it doesn't work well, we should first define the scope and the desired behavior.

In my case, I'm using inheritence to build controllers for views that are a lot like other existing views, but with some additional features. Most if not all of the behavior (and texts) of the existing view should be copied, and I'm managing that copy by extending the controller of the existing view. It's just that we need some extra things in the view, and these extra things might bring their own internationalization texts along.

So, what we really need is to include not only the i18n files for the extended controller, we also need to load any i18n files that might be created for the views managed by the superclasses of the extended controller. We need to mind the order too: the extended view might choose to override some of the texts defined by a superclass, so the i18n texts that are closer in the chain of inheritance should be given precedence.

A Per-view i18n solution that works with inheritance

The following code will solve this problem:
    _initI18n: function(){
      //first, check if we already constructed the i18n model for this class
      if (this.constructor._i18Model) {
        //we did! Don't do all that work again, just use the existing one.
        this.setModel(this.constructor._i18Model, i18n);
        return;
      }

      //check if the view and controller are in the same directory.
      //if they are not, then we need to take the possibility into account 
      //that the view and the controller might both have their own i18n files.
      var stack = [];
      var controllerClass = this.getMetadata();
      var viewName = this.getView().getViewName();
      if (controllerClass.getName() !== viewName) {
        //if the view name is different from controller name, 
        //then we assume the view may have its own i18n 
        //that overrides those of the controller 
        stack.push(viewName);
      }

      //walk the chain of inheritance up to sap.ui.core.mvc.Controller
      //store each superclass at the front of the stack
      var className, rootControllerClassName = "sap.ui.core.mvc.Controller";
      while (true) {
        className = controllerClass.getName();
        if (className === rootControllerClassName) {
          break;
        }
        stack.unshift(className);
        controllerClass = controllerClass.getParent();
      }

      //walk the stack and create a resourcebundle for each class
      //use it to enhance this class' i18n model.
      stack.forEach(function(className){
        var bundleData;
        if (window[className] && window[className]._i18Model) {
          bundleData = window[className]._i18Model.getResourceBundle();
        }
        else {
          className = className.split(".");
          //snip off the local classname to get the directory name
          className.pop();  
          //add i18n to make the i18n directory name 
          className.push(i18n);
          //add i18n to point to the i18n.properties file(s) 
          className.push(i18n);
          bundleData = {bundleName: className.join(".")};
        }
        
        var i18nModel = this.getModel(i18n);
        if (i18nModel) {
          i18nModel.enhance(bundleData);
        } 
        else {
          i18nModel = new ResourceModel(bundleData);
          this.setModel(i18nModel, i18n);
        }
      }.bind(this));        

      //cache the i18n model for new instances of this class.
      this.constructor._i18Model = this.getModel(i18n);
    }
Note that this code replaces the _initI18n-method that appeared in my prior blog posts on this topic. It is also assumed that this method sits in some abstract base controller, which you'll extend to create actual concrete controllers for your views.

Here are a couple of highlights that explain the new and improved _initI18N-method:
  • Examining the entire chain of inheritance, up until sap.ui.core.mvc.Controller. This is achieved with this snippet:
          var stack = [];
          ...
          var className, rootControllerClassName = "sap.ui.core.mvc.Controller";
          while (true) {
            className = controllerClass.getName();
            if (className === rootControllerClassName) {
              break;
            }
            stack.unshift(className);
            controllerClass = controllerClass.getParent();
          }
    
    The subsequenct forEach-iteration of the stack then constructs the resource bundle to enhance the i18n model in the usual way.

    Note that names of superclasses that are "higher up" in the hierarchy (or put another way: more basal) are stacked in front of subclasses. This way, the forEach-array method will encounter the class names in the desired order, allowing the subclasess to override texts added by superclasses.
  • Distinguish between texts defined by the view and the controller.

    Admittedly this scenario is quite rare, but if the controller and view each define their own i18n files, then we'd like to enhance our i18n model with files from both resourcebundles. I somehwat arbitrarily decided that in this case, the view's texts should probably override those of the controller.

    I achieve this with this piece of code:
          var stack = [];
          var controllerClass = this.getMetadata();
          var viewName = this.getView().getViewName();
          if (controllerClass.getName() !== viewName) {
            //if the view name is different from controller name, 
            //then we assume the view may have its own i18n 
            //that overrides those of the controller 
            stack.push(viewName);
          }
    
    In other words, the view name is placed as very last item of the stack; if it differs from the controller name, then the controller name appears directly before it, making the view resource bundle the last to enhance our i18n model.
  • Caching the i18n model at the class level, so that every instance may reuse it.

    While I fixed the inheritance issue, it occurred to me that all instances of the controller would go through their own cycle of building the i18n model. Since the i18nmodel deals almost only with static texts, it seemed wasteful to repeat all that work for each instance. We can simply store the i18n model as a property of the constructor, and retrieve it any time we're creating a new instance.

    This is achieved with the very first and last bits of the _initI18n-method:
          //first, check if we already constructed the i18n model for this class
          if (this.constructor._i18Model) {
            //we did! Don't do all that work again, just use the existing one.
            this.setModel(this.constructor._i18Model, i18n);
            return;
          }
          ...
          ...
          ...
          //cache the i18n model for new instances of this class.
          this.constructor._i18Model = this.getModel(i18n);
    

Finally

I hope you enjoyed this post. Let me know and drop a line!

Thursday, September 22, 2016

SAP UI5: Internationalization for each view - Addendum for Nested Views

After writing my previous post on SAP UI5: Per-view Internationalization, I found out that the solution does not work completely as intended when using nested views.

If you're using nested views, each view would still have its own set of unique texts that are entirely specific to just that view, and for those cases, the solution as described still works. But there might also be a number of texts that are shared by both the outer and one or more of the inner views. It would make sense to be able to define those texts in the i18n model at the level of the outer view, and have the i18n models of the nested view pick up and enhance the i18n model of the outer view.

Problem: onInit() is not the right place to initialize the i18n model

The problem with the original solution is that the onInit() method of the nested views gets called before that of the outer view. It makes sense - the whole can be initialized only after its parts have been initialized. But this does mean that the onInit() method is not the right place to initialize the i18n model.

Please consider these lines from the _initI18n() method that I proposed to initialize the i18n model:

      //Use the bundledata to create or enhance the i18n model
      var i18nModel = this.getModel(i18n);
      if (i18nModel) {
        i18nModel.enhance(bundleData);
      }
      else {
        i18nModel = new ResourceModel(bundleData);
      }

      //set this i18n model.
      this.setModel(i18nModel, i18n);

Suppose this code runs as part of a nested view's onInit(). The call to getModel() will try to acquire the i18n model that is already set, or else the i18n model of the owner component. That's how the getModel() method in the base controller works (please see my previous blog post to review that code).

Now, at this point, no i18n model has been set for the view, and so the owner component's i18n model will be picked up. The i18n model of the outer view will however never be found, since the onInit() of the controller of the outer view has not been called yet (and therefore, its _initI18n() has not been called either).

Solution: Use onBeforeRendering() rather than onInit()

It turns out that this can be solved by calling the _initI18N() method in the onBeforeRendering() method rather than in the onInit() method. While nested views are initialized before initializaing the outer view, it's the other way around for the rendering process. This makes sense: as the outer view is being rendered, this requires rendering of its containing views. So the onBeforeRendering() method of the outer view will be called before the onBeforeRendering() method of its nested views. (It's the other way around for onAfterRendering(): outer views will be done rendering after its containing views are rendered).

Ensure i18n initialization occurs only once

There is one extra consideration in moving the i18n initialization from the onInit() to onBeforeRendering(). The reason is that views may go trough multiple rendering cycles, whereas the onInit() will only run once. If there are repeated rendering cycles, we do not want to reinitialize the i18n model, so we add a lock that ensures the i18n model is initialized only once:

    ...
    onInit: function(){
      this._initI18n();
    },
    onBeforeRendering: function(){
      this._initI18n();
    },
    _i18nInitialized: false,
    _initI18n: function(){
      if (this._i18nInitialized === true) {
        return;
      }
      var i18n = "i18n";

      //create bundle descriptor for this controllers i18n resource data
      var metadata = this.getMetadata();
      var nameParts = metadata.getName().split(".");
      nameParts.pop();
      nameParts.push(i18n);
      nameParts.push(i18n);
      var bundleData = {bundleName: nameParts.join(".")};

      //Use the bundledata to create or enhance the i18n model
      var i18nModel = this.getModel(i18n);
      if (i18nModel) {
        i18nModel.enhance(bundleData);
      }
      else {
        i18nModel = new ResourceModel(bundleData);
      }

      //set this i18n model.
      this.setModel(i18nModel, i18n);
      this._i18nInitialized = true;
    },
    ...

Overriding onBeforeRendering() in extensions of the base controller

And of course, when extending the base controller, you'll need to remember to call the onBeforeRendering() method of the ascendant when overriding the onBeforeRendering() method:

sap.ui.define([
  "just/bi/apps/components/basecontroller/BaseController"
], function(Controller){
  "use strict";
  var controller = Controller.extend("just.bi.apps.components.mainpanel.MainPanel", {
    onBeforeRendering: function(){
      Controller.prototype.onBeforeRenderingcall(this);
      ...
    }
  });
  return controller;
});

Finally

I hope you enjoyed this addendum. Feel free to share your insights if you think there is a better way to do handle i18n.

Sunday, September 18, 2016

SAP UI5: Per-view Internationalization

NOTE: There is an addendum to this blog post that suggests a number of improvements. You can check out the addendum here: SAP UI5: Internationalization for each view - Addendum for Nested Views.

Quite recently, I dove into SAP UI5 development. To educate myself, I followed a workshop and I used the Walkthrough.

During my explorations, I ran into a particular issue which I didn't see very readily addressed. I also found a solution for this particular issue, and even though I still have a ton to learn, I think it is worth sharing. So, here goes:

The Feature: Translatable texts and the i18n model

One of the SAP UI5 features highlighted in the Walkthrough is treatment of translatable texts. In the walkthrough this is realized by setting up a resource model, the i18n model.

The i18n model is sourced form i18n .properties files, which are essentially lists of key/value pairs, one per line, and separated by an equals sign:

# Each line is a key=value pair.
greetingAction=Say Hello
greeting=Hello {0}!

To actually setup a i18n model using these texts, you can explicitly instantiate a sap.ui.model.resource.ResourceModel:

sap.ui.define([
    "sap/ui/core/mvc/Controller",
    "sap/ui/model/resource/ResourceModel"
], function(Controller, ResourceModel){
    "use strict";
    return Controller.extend("my.app.App", {
        onInit: function(){
            var i18nModel = new ResourceModel({
                bundleName: "just.bi.apps.JustBiApp.i18n.i18n"
            });
            this.getView().setModel(i18nModel, "i18n");
        }
    });
});

Or, you can have your application instantiate the model by listing it in the models property of the sap.ui5 entry in the manifest.json application descriptor file:

"models": {
 "i18n": {
   "type": "sap.ui.model.resource.ResourceModel",
    "settings": {
      "bundleName": "just.bi.apps.JustBiApp.i18n.i18n"
    }
  }
}

In many cases, the text is required for the static labels of ui elements like input fields, menus and so on. Inside a view, static texts may be retrieved from the i18n model through special data binding syntax, like so:

<-- Button text will read "Say Hello" -->
<Button text="{i18n>greetingAction}"/>

Texts may also be retrieved programmatically inside controller code by calling the .getText() method on the resource bundle object. The resource bundle object is may be obtained from the i18n resource model with the getResourceBundle() getter method:

var bundle = this.getModel("i18n").getResourceBundle();
var text = bundle.getText("greeting", ["World"]);      // text has value "Hello, World!"

Now, the cool thing is that you can write a separate i18n .properties file for each locale that you want to support. The framework discovers which locale is required by the client and use that to find the best matching i18n files appropriate for the client's locale.

The file name is used to identify to which language and/or locale the texts inside the file apply. For example, you'd put the German texts in a i18n_de.properties file, and the English texts in a i18n_en.properties file, and if you want to distinguish between British and American English, you'd create both a i18n_en_GB.properties and i18n_en_US.properties file.

(I haven't found out to exactly which standard the sap ui i18n .properties files apply, but from what I've seen so far I think it's safe to assume that you can use the two-letter lowercase ISO 639-1 code for the language and the two-letter uppercase ISO 3166-1 code for the country)


The Problem: One i18n Model for entire application

Now, the walkthrough demonstrates the feature by adding one i18n model for the entire application so that it becomes available in any of the views that make up the application. I appreciate that the walkthrough is not the right place to cover all kinds of more advanced scenarios, so I can understand why it settles for just one application-wide i18n model.

However, I can't help but feeling this is not an ideal approach. Main reason is that it seems at odds with the fact that many texts are specific to just one particular view. This challenges both the development workflow as well as the reusability of our application components:


  • Each time you create or modify a particular view, you also have to edit the global i18n .properties files. To keep things manageable, you will probably invent some kind of view-specific prefix to prefix the keys pertaining to that view, and you'll probably end up creating a view-specific block in that i18n file. At some point, you'll end up with a lot of lines per i18n file, which is not so maintainable
  • Suppose you want to reuse a particular view in another application. Contrary to the practice used in the Walthrough, I like to keep view and associated Controller together, and in a folder separate from any other view and controller. This way I can easily copy, move or remove the things that belong together. Except that the texts, which also belong to that view/controller, are in the global i18n .properties file, and need to be managed separately.

The Solution: Keep per-view i18n files near view and controller code

The solution I found is to create a i18n subfolder beneath the folder that contains my Controller and View. Since I already keep each associated view and controller together, and separate from the other views and controllers, this approach makes sense: It's just one step further in keeping code and resources that depend directly on each other physically together.


So, this is what my file and folder structure looks like:

FolderStructure.png

So, the webapp folder is the root of the sap ui5 project. The components folder is where I keep subfolders for each functional unit (i.e. View+Controller+resources) of the application. In the picture, you see two such subfolders, basecontroller (more about that below) and mainpanel.

The mainpanel folder is the one that contains an actual component of my application - a MainPanel View, and its controller (in MainPanel.view.xml and MainPanel.controller.js respectively). Here we also find the i18n folder specific to this view, and inside are the i18n .properties files (one for each locale we need to support).

In order to load and apply the view-specific i18n .properties files, I'm using generic extension of sap.ui.core.mvc.Controller which loads the "local", view-specific i18n resource bundle. This extension is called BaseController and is in the basecontroller folder. Here's the code:

sap.ui.define([
  "sap/ui/core/mvc/Controller",
  "sap/ui/model/resource/ResourceModel"
], function(Controller, ResourceModel){
  "use strict";
  var controller = Controller.extend("just.bi.apps.components.basecontroller.BaseController", {
    onInit: function()
      this._initI18n();
    },
    _initI18n: function(){
      var i18n = "i18n";

      //create bundle descriptor for this controllers i18n resource data
      var metadata = this.getMetadata();
      var nameParts = metadata.getName().split(".");
      nameParts.pop();
      nameParts.push(i18n);
      nameParts.push(i18n);
      var bundleData = {bundleName: nameParts.join(".")};

      //Use the bundledata to create or enhance the i18n model
      var i18nModel = this.getModel(i18n);
      if (i18nModel) {
        i18nModel.enhance(bundleData);
      }
      else {
        i18nModel = new ResourceModel(bundleData);
      }

      //set this i18n model.
      this.setModel(i18nModel, i18n);
    },
    getModel: function(modelname){
      var view = this.getView();
      var model = view.getModel.apply(view, arguments);
      if (!model) {
        var ownerComponent = this.getOwnerComponent();
        if (ownerComponent) {
          model = ownerComponent.getModel(modelname);
        }
      }
      return model;
    },
    setModel: function(model, modelName){
      var view = this.getView();
      view.setModel.apply(view, arguments);
    },
  });
  return controller;
});

Note how the BaseController initializes the i18 model by calling the _init118n() method. In this method we extract the className of the this object from its metadata (using the .getName() getter on the metadata obtained using the .getMetadata() getter), and we pop off the unqualified classname to obtain its namespace. We then add the string "i18n" twice - once for the folder, and once for the files inside it. We use this to create the bundlename which we use to instantiate the actual ResourceModel.

Before setting that model to the controller's view, we check if there is already an i18n model set using getModel(). This is a utility method that gets the model from this controller's associated view, or of the component that "owns" the view and this controller.

If a i18n model is already available, we enhance that by calling the .enhance() method on it, rather than replacing it. This way, any texts defined at a higher level are still available in this controller and view. This gives us a functional i18n model, which we then set using setModel(), which simply calls setModel() on the view associated with this controller.

To actually use this BaseController, we extend it when creating a "real" controller:

sap.ui.define([
  "just/bi/apps/components/basecontroller/BaseController"
], function(Controller){
  "use strict";
  var controller = Controller.extend("just.bi.apps.components.mainpanel.MainPanel", {
    onInit: function(){
      Controller.prototype.onInit.call(this);
      ...
    }
  });
  return controller;
});

Note that if that real controller has its own onInit() method, we need to first call the onInit() method of BaseController, or rather, of whatever class we're extending. Fortunately, since we are extending it we already have a reference to it (in the snippet above, it's the Controller parameter injected into our definition), so we call its onInit() method via the prototype while using the controller that is currently being defined (this) as scope.

Finally

I hope you enjoyed this article! I hope it will be of use to you. If you have an alternative solution - quite possiby, a better one - then please feel free to leave a comment and point out your solution. I'm eager to hear and learn so don't hesitate to share your opinion and point of view.

ADDENDUM: Nested views

It turns out the approach described in this post does not work well when using nested views. Fortunately, a simple improvement of the ideas in this post solves that problem. The approach is described in my next blog post, SAP UI5: Internationalization for each view - Addendum for Nested Views.

Tuesday, September 04, 2007

Kettle Tip: Using java locales for a Date Dimension

The Date dimension is a well known construct in general data warehousing. In many cases, the data for a date dimension is generated using a database stored procedure or shell-script.

Another approach to obtain the data for a date dimension is to generate it using an ETL tool like Pentaho Data Integration, a.k.a. Kettle. I think this approach makes sense for a number of reasons:

  • When you tend to use a particular ETL tool, you will be able to reuse the date dimension generator over an over, and on different database platforms.

  • You won't need special database privileges beyond the ones you need already. Privileges for creating tables and to perform DML will usually be available, whereas you might need to convince a DBA that you require extra privileges to create and execute stored procedures.

In addition to these general considerations, you can pull a neat little trick with Kettle to localize the data and format of the date attributes. I wouldn't go as far as to say that this feature is Kettle specific: rather, it relies on the localization support built into the java platform and the way you can put that to use in Kettle transformations.

Prerequisites


In this tip, the steps to create a date dimension are described using Kettle 2.5.1 (Generally available Release) and MySQL 5.1.20 (Beta). You will be able to follow through the example using earlier (and later) versions of both products though - I am not using any functionality that is specific to these particular version of the products. The recipe does not really require that you understand anything about data warehouses or date dimensions, but you will probably appreciate it better if you do ;)

Overview


The transformation to generate the data for the date dimension follows a pretty straightforward design. The graphical representation of the transformation is shown below:

localized_date_dimension1

First, the dimension table is created (Prepare). After that, rows are generated to fill it (Input). However, the generated rows are almost empty and barren - we still need to derive and add data to fill the attributes of the date dimension (Transformation). Finally, the data is stored in the date dimension table (Output).

Step-by-Step


The remainder of this article describes in detail how to build this transformation. The majority of steps is probably not very interesting to moderately experienced Kettle users, but may be of use to beginning users.

Note for users that are completely new to Kettle - it is advisable to review the first few chapters of the Spoon user guide (Spoon is the name of Kettle tool you use to design the ETL process). It explains how to start up the tool, create a new transformation, add and connect steps etc. You can find it in the docs/English directory beneath the Kettle home directory.

MySQL JDBC driver: setting the characterEncoding property to UTF8


You need to create a (JDBC) connection to MySQL in the usual, straightforward way:

kettle-mysql-connection

In addition, you need to set the characterEncoding property of the JDBC driver:

kettle-mysql-connection-utf8

This ensures MySQL will be able to understand the utf8 encoded data that we may produce to generate a date dimension in the, say, Chinese language. Note that you cannot just use a statement like SET NAMES utf8 to do this. This is not specific to Kettle, but has to do with the way the MySQL JDBC driver (Connector/J) handles character sets. Please refer to the "Using character sets and unicode" section of the Connector/J documentation for more information on this topic.

Creating the date dimension table

In this particular case, it seemed convenient to create the dimension table as part of the transformation. This is done using the "Execute SQL Script" step shown below:

kettle-sql-script

The "Execute SQL Script" step executes the following script to create the date dimension table:
DROP TABLE IF EXISTS dim_date
;
CREATE TABLE IF NOT EXISTS dim_date (
date_key smallint unsigned NOT NULL,
date date NOT NULL,
date_short char(12) NOT NULL,
date_medium char(16) NOT NULL,
date_long char(24) NOT NULL,
date_full char(32) NOT NULL,
day_in_year smallint unsigned NOT NULL,
day_in_month tinyint unsigned NOT NULL,
is_first_day_in_month char(10) NOT NULL,
is_last_day_in_month char(10) NOT NULL,
day_abbreviation char(3) NOT NULL,
day_name char(12) NOT NULL,
week_in_year tinyint unsigned NOT NULL,
week_in_month tinyint unsigned NOT NULL,
is_first_day_in_week char(10) NOT NULL,
is_last_day_in_week char(10) NOT NULL,
month_number tinyint unsigned NOT NULL,
month_abbreviation char(3) NOT NULL,
month_name char(12) NOT NULL,
year2 char(2) NOT NULL,
year4 year NOT NULL,
quarter_name char(2) NOT NULL,
quarter_number tinyint NOT NULL,
year_quarter char(7) NOT NULL,
year_month_number char(7) NOT NULL,
year_month_abbreviation char(8) NOT NULL,
PRIMARY KEY(date_key),
UNIQUE(date)
)
ENGINE=MyISAM
DEFAULT CHARACTER SET utf8
DEFAULT COLLATE utf8_unicode_ci
This is by no means a complete date dimension. The most important limitation is that it only contains attributes that are immediately derivable from the calendar. So, attributes to denote business specific periods like the fiscal year and holidays are not included.

Generating 10 years worth of days

The grain of the date dimension is days - a row in the date dimension represents a single day. In this case, the "Generate Rows" is configured to generate 3660 rows, which roughly corresponds with enough days to last 10 years:

kettle-generating-10-years

In the example, this step is also used to provide parameters to generate the date dimension data. As we'll see in a moment, the inital_date field effectively specifies the first date that goes into the date dimension. The language_code and country_code fields are used to localize the textual attributes of the date dimension, and the local_yes and local_no fields are used for boolean attributes.

There are other ways to get these parameters into our transformation. For example, we could have used an "Add Constants" step with a similar result. Another possibility would be to get this data from the environment using a "Get Variables" step, and this would allow the parameters to be specified at transformation run-time.

Counting the days

Although we certainly generate enough rows, they are all identical. In order to have each row represent a single distinct day, we need a way to 'count' the generated rows. We do this by adding a "Sequence" step:

kettle-adding-sequence

In this case, we use the "Add sequence" step to generate an incrementing number within the scope of the transformation. As we'll see later on, we can add this to our initial date to get a series of consecutive dates.

Calculating date dimension Attributes

The previous steps form a basis from which we can derive all of the attributes that currently make up our date dimension. To actually calculate the date attributes, we use a "Modified Java Script Value" step:

kettle-javascript

Kettle comes with an embedded Rhino javascript engine. The "Modified Java Script Value" step lets you use it to run javascript code to as part of the transformation.

The javascript code is executed for each row that comes out of the previous steps. In the script code, one can reference the values from the input rows, perform some processing on them, and generate new output fields.

One of the fortunate characteristics of the Rhino engine is that it lets us use java classes inside the javascript code. Let's take a look at the script to see how we can use that to generate the localized data for our data dimension attributes.
Initialization
The first thing we do in the javascript code is to get data from the current input row. In the "Generate Rows" step, we added the language_code and country_code fields to specify a locale. Here, in the script, we use the following piece of code to turn that into a java.util.Locale object:
//Create a Locale according to the specified language code
var locale = new java.util.Locale(
language_code.getString() //get the ISO639 language_code from the input row
, country_code.getString() //get the ISO3166 country_code from the input row
);
The java.util.Locale class represents a particular cultural region. It forms a cornerstone of the internationalization support built into the java platform, and provides information to many other classes to generate appropriately localized output.

We will be using the locale on a number of occasions, but first, we use our it to initialize a java.util.Calendar object:

//Create a calendar, use the specified locale
var calendar = new java.util.GregorianCalendar(locale);
(Note that the java platform currently only provides one concrete Calender Class: the java.util.GregorianCalendar. Unfortunately, java does not seem to provide a built-in recipe for dealing with, for example, Islamic or Hebrew calendars).

We require the calendar object to obtain an instance of the java.util.Date Class that represents the date corresponding to the current row. To do that, we first set the calendar's current date using the initial_date field that was specified in the "Generate Rows" step:
//Set the initial date
calendar.setTime(initial_date.getDate());
We need this to add the number of days generated by our "Add Sequence" step:
//set the calendar to the current date by adding DaySequence days
calendar.add(calendar.DAY_OF_MONTH,DaySequence.getInteger() - 1);

(Note that we substract 1 from the DaySequence value. This is because our sequence starts at 1, and we want the specified initial date to be included in our date dimension).

We conclude the initialization of the script by retrieving a java.util.Date object that represents the date for the current row.
//get the calendar date
var date = new java.util.Date(calendar.getTimeInMillis());
This java.util.Date instance is assigned to the date variable in the script, allowing it to be used as an output field of the javascript step. We require this in order to fill the date column of the date dimension table. We will also be using the date variable later on in this script to derive the value of other date dimension attributes.
Getting Text representations of full dates
Our date dimension has a number of attributes to denote a complete date containing day, month and year parts, in various formats: date_short, date_medium, date_long and date_full. These are all generated using the java.text.DateFormat class.

To do that, we first need to create an appropriate DateFormat instance using the static getDateInstance() method, passing our locale object as well as a constant that specifies whether we want to short, medium, long or full format. Then, we can pass the java.util.Date object for which we want to obtain the textual representation to the format method of the newly created java.text.DateFormat instance:
//en-us example: 9/3/07
var date_short = java.text.DateFormat.getDateInstance(
java.text.DateFormat.SHORT
, locale
).format(date);
//en-us example: Sep 3, 2007
var date_medium = java.text.DateFormat.getDateInstance(
java.text.DateFormat.MEDIUM
, locale
).format(date);
//en-us example: September 3, 2007
var date_long = java.text.DateFormat.getDateInstance(
java.text.DateFormat.LONG
, locale
).format(date);
//en-us example: Monday, September 3, 2007
var date_full = java.text.DateFormat.getDateInstance(
java.text.DateFormat.FULL
, locale
).format(date);
Formatting date parts
Extracting and formatting different date parts is most easily done by applying the format function on a subclass of java.text.Dateformat, the java.text.SimpleDateFormat class. The java.text.SimpleDateFormat class allows formatting of dates based on date and time patterns:
//day in year: 1..366
var simpleDateFormat = java.text.SimpleDateFormat("D",locale);
var day_in_year = simpleDateFormat.format(date);
In this example, we pass both the locale and a date pattern to the constructor to create an instance of the java.text.SimpleDateFormat class. The pattern is passed as the string "D", specifying a day-in-year format.

Once we created the java.text.SimpleDateFormat instance, we can apply a new pattern to it using the applyPattern() method. Calling the format method again, we obtain the date in the desired format:
//day in month: 1..31
simpleDateFormat.applyPattern("d");
var day_in_month = simpleDateFormat.format(date);
//en-us example: "Monday"
simpleDateFormat.applyPattern("EEEE");
var day_name = simpleDateFormat.format(date);
//en-us example: "Mon"
simpleDateFormat.applyPattern("E");
var day_abbreviation = simpleDateFormat.format(date);
//week in year, 1..53
simpleDateFormat.applyPattern("ww");
var week_in_year = simpleDateFormat.format(date);
//week in month, 1..5
simpleDateFormat.applyPattern("W");
var week_in_month = simpleDateFormat.format(date);
//month number in year, 1..12
simpleDateFormat.applyPattern("MM");
var month_number = simpleDateFormat.format(date);
//en-us example: "September"
simpleDateFormat.applyPattern("MMMM");
var month_name = simpleDateFormat.format(date);
//en-us example: "Sep"
simpleDateFormat.applyPattern("MMM");
var month_abbreviation = simpleDateFormat.format(date);
//2 digit representation of the year, example: "07" for 2007
simpleDateFormat.applyPattern("y");
var year2 = simpleDateFormat.format(date);
//4 digit representation of the year, example: 2007
simpleDateFormat.applyPattern("yyyy");
var year4 = simpleDateFormat.format(date);
Dealing with Quarters
Although the java.text.SimpleDateFormat class is useful, it does not provide any functionality for working with quarters. We do want our date dimension to contain attributes to represent the quarter, so we have to reside to computing these manually:
//handling Quarters is a DIY
var quarter_name = "Q";
var quarter_number;
switch(parseInt(month_number)){
case 1: case 2: case 3: quarter_number = "1"; break;
case 4: case 5: case 6: quarter_number = "2"; break;
case 7: case 8: case 9: quarter_number = "3"; break;
case 10: case 11: case 12: quarter_number = "4"; break;
}
quarter_name += quarter_number;
Although this will do for now, this solution doesn't really cut it because it does not produce localized output. Anyway, it is better than nothing so we'll just have to make do with it.
Period demarcation flags
Our date dimension has a few attributes that are used to indicate the start and end of week and month periods. We use simple yes/no type flags, but we allow the actual "yes" and "no" values to be specified by the user in the "Generate Rows" step. We retrieve them with the following piece of code:
//get the local yes/no values
var yes = local_yes.getString();
var no = local_no.getString();
We can now use them these to flag the start and end of week and month periods.

The start (and of course, also the end) of the week are subject to the locale. In order to find out if we are dealing with the first day of a week, we use the getFirstDayOfWeek() method of the java.util.Calendar class. By comparing its return value with the day of week of the current row, we can see if we happen to be dealing with the first day of the week:
//initialize for week calculations
var first_day_of_week = calendar.getFirstDayOfWeek();
var day_of_week = java.util.Calendar.DAY_OF_WEEK;

//find out if this is the first day of the week
var is_first_day_in_week;
if(first_day_of_week==calendar.get(day_of_week)){
is_first_day_in_week = yes;
} else {
is_first_day_in_week = no;
}
Note that we obtain the current day of the week by passing the value of the DAY_OF_WEEK constant to the get method of the java.util.Calendar object that we initialized at the start of the script.

In order to set the value for the is_last_day_in_week attribute of the date dimension, we simply find out if the next day happens to be the first day of the week. If it is, then by definition, the current row represents the last day of the week:
//calculate the next day
calendar.add(calendar.DAY_OF_MONTH,1);

//get the next calendar date
var next_day = new java.util.Date(calendar.getTimeInMillis());

//find out if this is the first day of the week
var is_last_day_in_week;
if(first_day_of_week==calendar.get(day_of_week)){
is_last_day_in_week = yes;
} else {
is_last_day_in_week = no;
}
(Note that we have already used similar code to add a day to a date when we added the day sequence to the initial date.)

We can use similar logic to calculate the values for the is_first_day_in_month and is_last_day_in_month indicators. This is actually easier, because the first day in the month is not dependant upon the locale (at least - not within one calendar). So, we only need to find out if the day of month is equal to one:
//find out if this is the first day of the month
var is_first_day_of_month;
if(day_in_month == 1){
is_first_day_in_month = yes;
} else {
is_first_day_in_month = no;
}

//find out if this is the last day in the month
var is_last_day_of_month;
if(java.text.SimpleDateFormat("d",locale).format(next_day)==1){
is_last_day_in_month = yes;
} else {
is_last_day_in_month = no;
}
A few more date attributes
We conclude the computation of the date attributes by adding a few more useful labels:
//a few useful labels
var year_quarter = year4 + "-" + quarter_name;
var year_month_number = year4 + "-" + month_number;
var year_month_abbreviation = year4 + "-" + month_abbreviation;
Like when we calculated the quarters, this is actually not a very good method because the results will not be localized. That said, the result will make sense for many locales, and we don't really have a better way to deal with it right now.
Defining the step outputs
We just calculated all the required values to fill the attributes of our date dimension. We just need to get them out of the script and into the outputs of the step.

Every variable declared in the javascript (using the var keyword) can be used as an output field of the javascript step. The easiest way to generate the outputs is by hitting the "Get Variables" button at the bottom of the dialog. This simply adds an output field for each variable declared in the script:

kettle-javascript-outputs

By default, the data type for all the outputs added in this way is set to the String type. Although it is good practice to choose a more specific data type, it is almost always unncessary in this case, as all integer type values will be correctly converted implicitly when we insert them into the database. There is one exception in this case, and that is the date output. Inside the script, it is an instance of a java.util.Date class, and we must set the type to "Date" in the output too. Otherwise, the (java) string representation of the java.util.Date object will be sent as output, and this is not automatically recognized as a date by MySQL.

Discarding Fields

We are now almost ready to insert the rows into the date dimension table. We only need to discard all fields in the stream that do not correspond with any of the columns in our date dimension table. We use a "Select Values" step to do that:

kettle-select-values

We use the "Get Fields To Select" button to pull in all available fields, and after that, simply select and delete each field that we do not need. As a final step, we rename the DaySequence field to date_key to map it to the date_key column in our date dimension table.

Inserting data into the table

In the final step, we add the generated data to the dim_date table we created in the very first step of the transformation:

kettle-table-output

We only need to specify the connection and the table name here, and the step will then automatically attempt to map the fields of the incoming rows to table columns.

We could have used the "Insert / Update" step, or even the "Execute SQL Script" step too to write the data to the dimension table, but that would require a little bit of extra work.

Running the transformation


After building the transformation, you can run it by hitting the "running man" icon on the toolbar. This will open a dialog where you can set a number of properties for the transformation. Hit "Launch" button there and after that, the transformation will be executed:

kettle-run-transformation

Closing Notes


I hope you enjoyed this tip. If you want to, you can download the kettle transformation here, and use it as you see fit.

If you are interested in open source data warehousing, register for the MySQL Enterprise Data Warehousing Seminar, Thursday, September 06, 2007 and hear what Robin Schumacher has to say about that subject. (Note that this is a general MySQL data warehousing seminar - this post and the seminar are unrelated)

DataZen winter meetup 2025

The DataZen winter meetup 2025 is nigh! Join us 18 - 20 February 2025 for 3 days of expert-led sessions on AI, LLM, ChatGPT, Big Data, M...