Ancestors()
and Ascendants
to retrieve the full lineage of members. (See: "MDX: retrieving the entire hierarchy path with Ancestors()".)
As you might recall, the immediate reason to write about those functions was to find a pure MDX solution to implement the "Show Parents" / "Hide Parents" functionality offered by OLAP cube browsers. To recap, developers of MDX-based pivot tables face a challenge when rendering the result of a query like this:
SELECT CrossJoin( [Product].[Product].Members, [Measures].[Sales] ) ON COLUMNS, [Time].[Months].Members ON ROWS FROM [SteelWheelsSales]In plain English: Sales of products across months.
The raw result might look something like this:
1968 Ford Mustang | 1958 Chevy Corvette Limited Edition | ...more Classic Cars... | 1997 BMW R 1100 S | 2002 Yamaha YZR M1 | ...more products... | |
---|---|---|---|---|---|---|
Time | Sales | Sales | Sales | Sales | Sales | Sales |
Jan | 1,742 | |||||
Feb | 2,846 | |||||
...more months... | ... | ... | ... | ... | ... | ... |
Jan | 7,499 | ... | 2,254 | 3,222 | ... | |
Feb | 4,518 | 847 | ... | 2,921 | 3,865 | ... |
...more months... | ... | ... | ... | ... | ... | ... |
Jan
, and two rows labeled Feb
. Since we asked for time in months, these labels probably indicate the months January and February. But which "Jan" or "Feb" exactly? Since each year cycles through all the months we need to know the year as well.
The column labels are also challenging to interpret. What kind of thing is a "1968 Ford Mustang", and what kind of thing is a "1997 BMW R 1100 S"? A domain expert might know the former is a Classic car and the latter a motorcycle, but the labels themselves do not make this clear.
We could of course change the query and add members for the
[Time].[Years]
level and the [Product].[Line]
level and wrap the sets in the Hierarchize()
function:
SELECT Hierarchize( CrossJoin( {[Product].[Line].Members ,[Product].[Product].Members}, [Measures].[Sales] ) ) ON COLUMNS, Hierarchize( {[Time].[Years].Members ,[Time].[Months].Members} ) ON ROWS FROM [SteelWheelsSales]This would at least add extra rows and columns respectively, which would appear as a "break" value announcing a list of subsequent items at the lower level: (Please Note: I'm referring to
[Time].[Years]
level as a "higher" level than [Time].[Months]
, whereas the ordinal number of level, also known as the "level number" is actually lower.):
Classic Cars | 1968 Ford Mustang | 1958 Chevy Corvette Limited Edition | ... | Motorcycles | 1997 BMW R 1100 S | 2002 Yamaha YZR M1 | ... | |
---|---|---|---|---|---|---|---|---|
Time | Sales | Sales | Sales | ... | Sales | Sales | Sales | ... |
2003 | 1,514,407 | 62,140 | 11,553 | ... | 397,220 | 37,016 | 27,730 | ... |
Jan | 41,192 | 1,742 | ... | ... | ||||
Feb | 20,464 | ... | 25,784 | 2,846 | ... | |||
... | ... | ... | ... | ... | ... | ... | ... | ... |
2004 | 1,838,275 | 67,155 | 20,145 | ... | 590,580 | 42,138 | 42,483 | ... |
Jan | 122,792 | 7,499 | ... | 41,201 | 2,254 | 3,222 | ... | |
Feb | 137,641 | 4,518 | 847 | ... | 49,067 | 2,921 | 3,865 | ... |
... | ... | ... | ... | ... | ... | ... | ... | ... |
Simply adding members at higher levels could even add more confusion if our query was designed so as to not select all members at the
[Time].[Months]
and [Product].[Product]
levels, but only a selection of particular members. In that case, the values presented for the higher levels will still report the aggregate measure for all children of that member, and not those that happen to be selected in the current query. In this case the numbers may appear not to add up to the total at a higher level.
So, we'd really like an altogether different solution that simply allows us to see the extra labels belonging to the higher level members, without actually seeing extra rows, columns and values corresponding to those higher level members.
In that prior post I also took a look at how open source OLAP cube browsers like Saiku and Pivot4j implement this. It turned out that Saiku actually does add higher level members, and simply filters those out of the final result, whereas Pivot4J uses the
Olap4J
API to retrieve ancestor members (and their captions).
I wanted to solve this problem with a solution that only requires MDX and that does not require adding higher level members (and forcing calculation of those values), and I explained how we can find the captions of higher levels using the
Ascendants()
function. But what I failed to do is to provide are clear, generally applicable recipe that one can apply to any arbitrary MDX query.
I think I have now found a way to do this, and so I'm writing this post to share what I found, in the hope of getting feedback, and maybe help others that face the same challenge.
How to add lineage information to arbitrary MDX COLUMNS
vs ROWS
queries
In this section I will describe a systematic transformation to turn an arbitrary MDX query and add the lineage information.
Before I explain the query transformation, please note that it applies to MDX queries that have both a
COLUMNS
and a ROW
axis in the SELECT
-list, and no other axes. The existence of a so-called "slicer"-axis (which appears in the WHERE
-clause, not in the SELECT
-list) does not affect the recipe and may or may not be present.
-
For each hierarchy on each axis, create a calculated member to retrieve the lineage for whatever member belongs to that hierarchy. (For details on how to do this, please refer to my prior blogpost on
Ancestors()
andAscendants()
.)
If the hierarchy is on theCOLUMNS
-axis, then put this calculated member in whatever is the last hierarchy of theROWS
-axis; Vice versa, if the hierarchy is on theROWS
axis, then put this calculated member in whatever is the last hierarchy of theCOLUMNS
-axis. As member name, you can use some descriptive text like[Lineage of <hierarchy>]
.
(Note that you only need to make a calculated member if there is a lineage at all. For example, the[Measures]
-hierarchy only has one level, and hence there is no lineage. Therefore it is perfectly ok to omit the[Measures]
-hierarchy, since a calculated member to retrieve the lineage of its members simply would not provide any extra information.)
So, for our original input query we have the[Product]
-hierarchy on theCOLUMNS
-axis, and the[Time
-hierarchy on theROWS
-axis; the last hierarchy on theCOLUMNS
-axis is[Measures]
and the last hierarchy on theROWS
axis is[Time]
. So we get:WITH MEMBER [Measures].[Ancestors of Time] AS Generate( Order( Ascendants([Time].CurrentMember), [Time].CurrentMember.Ordinal, ASC ), [Time].CurrentMember.Properties("MEMBER_CAPTION"), "," ) MEMBER [Time].[Ancestors of Product] AS Generate( Order( Ascendants([Product].CurrentMember), [Product].CurrentMember.Ordinal, ASC ), [Product].CurrentMember.Properties("MEMBER_CAPTION"), "," )
-
For all but the last hierarchy on each axis, create a dummy calculated member of the form:
MEMBER [<hierarchy>].[Lineage] AS 1
We need these dummy members to create tuples to hold the functional calculated members that report the lineage. We could have used an existing member of these hierarchies instead, but the advantage of creating a dummy calculated member is that we get to give it a name that clearly indicates its purpose. We don't actually need the value of these members at all, which is why assigned the constant1
.
So, in our example, theCOLUMNS
-axis contains 2 hierarchies, of which[Product]
is the first one, so we create its dummy calculated member:MEMBER [Product].[Lineage] AS 1
Since ourROWS
-axis only contains one hierarchy we don't need to add any dummy calculated members for that. -
For each axis, wrap the original set expression in a
Union()
function. The second argument to thatUnion()
will be the original axis expression. The first argument should be a newly constructed set of our calculated members to report the lineage.
In the case of our example, we only have one hierarchy on each axis for which we want the lineage, so instead of a full blown set, the first argument toUnion()
can be simply a tuple consisting of the[Lineage]
dummy calculated members (that is, if they exist) plus the[Lineage of <hierarchy>]
calculated member that retrieves the lineage. It's rather more difficult to explain than to simply show, so here's the entire query:WITH MEMBER [Measures].[Ancestors of Time] AS Generate( Order( Ascendants([Time].CurrentMember), [Time].CurrentMember.Ordinal, ASC ), [Time].CurrentMember.Properties("MEMBER_CAPTION"), "," ) MEMBER [Time].[Ancestors of Product] AS Generate( Order( Ascendants([Product].CurrentMember), [Product].CurrentMember.Ordinal, ASC ), [Product].CurrentMember.Properties("MEMBER_CAPTION"), "," ) MEMBER [Product].[Lineage] AS 1 SELECT Union( ([Product].[Lineage], [Measures].[Ancestors of Time]), CrossJoin( [Product].[Product].Members, [Measures].[Sales] ) ) ON COLUMNS, Union( [Time].[Ancestors of Product], [Time].[Months].Members ) ON ROWS FROM [SteelWheelsSales]
Lineage | 1968 Ford Mustang | 1958 Chevy Corvette Limited Edition | ... | |
---|---|---|---|---|
Time | Ancestors of Time | Sales | Sales | ... |
Ancestors of Product | Ancestors of Product | All Products,Classic Cars,Autoart Studio Design,1968 Ford Mustang | All Products,Classic Cars,Carousel DieCast Legends,1958 Chevy Corvette Limited Edition | ... |
Jan | All Years,2003,QTR1,Jan | 1,742 | ... | |
Feb | All Years,2003,QTR1,Feb | ... | ||
... | ... | ... | ... | ... |
Jan | All Years,2004,QTR1,Jan | 7,499 | ... | |
Feb | All Years,2004,QTR1,Feb | 4,518 | 847 | ... |
... | ... | ... | ... | ... |
Ancestors of Time
column glued at the front in between the original row labels and the cellset, and one extra Ancestors of Product
row added on top, in between the original column labels and the cellset. Beyond that extra row and column, we find our original cellset, exactly like it was in the original.
Inside the extra column, we find values like
All Years,2003,QTR1,Jan
for the first original Jan
row - in other words, the entire lineage up to the root level of the [Time]
-hierarchy. Note that now the second Jan
row is clearly distinguishable from the first one, since the value in the lineage column there is All Years,2004,QTR1,Jan
Similarly, the extra row contains values like
All Products,Classic Cars,Autoart Studio Design,1968 Ford Mustang
. So for each tuple along the COLUMNS
axis we get the full hierarchy of the [Product]
-hierarchy here in this extra row's values.
Of course, if a GUI tool were to render this in a way that makes sense to the end-user, some post-processing is required to extract the information from the extra column and row. But all that extra information is in one place, and it didn't require any extra but unused calculations or aggregations.
Dealing with multiple hierarchies
Our example query is rather simple, having only one hierarchy on each axis for which we need the lineage. But the approach is generic and works just as well if you have multiple hierarchies on the axes. While the actual query transformation remains essentially the same, I found it useful to add a few intermediate steps to the recipe in case of multiple hierarchies. This makes it easier (I think) to recognize the original query and the different steps of the query transformation transformation.So, let's do the transformation again, but now with this more complex query:
SELECT CrossJoin( CrossJoin( [Product].[Product].Members, [Order Status].Members ), [Measures].[Sales] ) ON COLUMNS, CrossJoin( [Time].[Months].Members, [Markets].[Country].Members ) ON ROWS FROM [SteelWheelsSales]
-
Create a named set for the expression on each axis, and give them some descriptive name like
[Original <axis>]
:WITH SET [Original Columns] AS CrossJoin( CrossJoin( [Product].[Product].Members, [Order Status].Members ), [Measures].[Sales] ) SET [Original Rows] AS CrossJoin( [Time].[Months].Members, [Markets].[Country].Members )
-
Create the calculated members for the lineage as well as the dummy calculated members exactly like I explained earlier for the simple original query. If you like, you can keep the calculated members that are to appear on the
COLUMNS
-axis separate from those for theROWS
-axis, and together with the named sets that are to appear on that axis.
The calculated members that are to appear on theCOLUMNS
-axis (and which will thus report the lineage information for the hierarchies on theROWS
-axis) are:MEMBER [Product].[Lineage] AS 1 MEMBER [Order Status].[Lineage] AS 1 MEMBER [Measures].[Lineage of Time] AS Generate( Order( Ascendants([Time].CurrentMember), [Time].CurrentMember.Level.Ordinal, ASC ), [Time].CurrentMember.Properties("MEMBER_CAPTION"), ", " ) MEMBER [Measures].[Lineage of Markets] AS Generate( Order( Ascendants([Markets].CurrentMember), [Markets].CurrentMember.Level.Ordinal, ASC ), [Markets].CurrentMember.Properties("MEMBER_CAPTION"), ", " )
The calculated members that are to appear on theROWS
-axis (and which will thus report the lineage information for the hierarchies on theCOLUMNS
-axis) are:MEMBER [Time].[Lineage] AS 1 MEMBER [Markets].[Lineage of Product] AS Generate( Order( Ascendants([Product].CurrentMember), [Product].CurrentMember.Level.Ordinal, ASC ), [Product].CurrentMember.Properties("MEMBER_CAPTION"), ", " ) MEMBER [Markets].[Lineage of Order Status] AS Generate( Order( Ascendants([Order Status].CurrentMember), [Order Status].CurrentMember.Level.Ordinal, ASC ), [Order Status].CurrentMember.Properties("MEMBER_CAPTION"), ", " )
-
Because our axes now have multiple calculated members to retrieve the lineage, it makes sense to put those in a set to simplify creation of the set that we want to glue to the original set of the input query. We can name these sets
[Lineage of <axis>]
:SET [Lineage of Rows] AS {[Measures].[Lineage of Time] ,[Measures].[Lineage of Markets]}
andSET [Lineage of Columns] AS {[Markets].[Lineage of Product] ,[Markets].[Lineage of Order Status]}
-
The final step is just as described earlier - using a
Union()
to add the extra calculated members to the original set expression on the axes. The difference with the earlier simple example is that instead of writing out the literal tuples, we now construct one tuple per axis consisting of all the dummy calculated members, which we thenCrossJoin()
with its respective[Lineage of <axis>]
. This then gives use the set that we can use as the first argument toUnion()
. The second argument to the union will of course be the[Original <axis>]
named sets we created for the original axis sets.
So, we get:Union( CrossJoin( ([Product].[Lineage], [Order Status].[Lineage]), [Lineage of Rows] ), [Original Columns] ) ON COLUMNS
andUnion( CrossJoin( [Time].[Lineage], [Lineage of Columns] ), [Original Rows] ) ON ROWS
WITH SET [Original Columns] AS CrossJoin( CrossJoin( [Product].[Product].Members, [Order Status].Members ), [Measures].[Sales] ) MEMBER [Product].[Lineage] AS 1 MEMBER [Order Status].[Lineage] AS 1 MEMBER [Measures].[Lineage of Time] AS Generate( Order( Ascendants([Time].CurrentMember), [Time].CurrentMember.Level.Ordinal, ASC ), [Time].CurrentMember.Properties("MEMBER_CAPTION"), ", " ) MEMBER [Measures].[Lineage of Markets] AS Generate( Order( Ascendants([Markets].CurrentMember), [Markets].CurrentMember.Level.Ordinal, ASC ), [Markets].CurrentMember.Properties("MEMBER_CAPTION"), ", " ) SET [Lineage of Rows] AS {[Measures].[Lineage of Time] ,[Measures].[Lineage of Markets]} SET [Original Rows] AS CrossJoin( [Time].[Months].Members, [Markets].[Country].Members ) MEMBER [Time].[Lineage] AS 1 MEMBER [Markets].[Lineage of Product] AS Generate( Order( Ascendants([Product].CurrentMember), [Product].CurrentMember.Level.Ordinal, ASC ), [Product].CurrentMember.Properties("MEMBER_CAPTION"), ", " ) MEMBER [Markets].[Lineage of Order Status] AS Generate( Order( Ascendants([Order Status].CurrentMember), [Order Status].CurrentMember.Level.Ordinal, ASC ), [Order Status].CurrentMember.Properties("MEMBER_CAPTION"), ", " ) SET [Lineage of Columns] AS {[Markets].[Lineage of Product] ,[Markets].[Lineage of Order Status]} SELECT Union( CrossJoin( ([Product].[Lineage], [Order Status].[Lineage]), [Lineage of Rows] ), [Original Columns] ) ON COLUMNS, Union( CrossJoin( [Time].[Lineage], [Lineage of Columns] ), [Original Rows] ) ON ROWS FROM [SteelWheelsSales]Nou, the query transformation result looks positively daunting. But it's all the result of the mechanical application of just a few rules. In that sense, it's not particularly difficult. The main thing is not to lose focus and not to forget a calculated member or a hierarchy here or there. For the original use-case, getting a query tool to retrieve the data in order to render results with ancestor information it's not really a problem since that can easily do this transformation by just looping through the axis and the hierarchies.
Note that you can use this slightly more elaborate second recipe just as well for a simple query like the one from our first example. It will work just as well and you'll end up functionally with the same result. It's just that for reasons of presentation and explanation it would be better to start with the simpler 3-step recipe before expanding the approach to multiple hierarchies.
No comments:
Post a Comment