About the Series ...
This is the twenty-fifth article of the series, MDX
Essentials. The series is designed to provide hands-on application of the
fundamentals of the Multidimensional Expressions (MDX) language, with
each tutorial progressively adding features designed to meet specific
real-world needs.
For more information about the series in general, as well as
the software and systems requirements for getting the most out of the lessons
included, please see the first article, MDX at First Glance:
Introduction to MDX Essentials.
Note: Service Pack 3 updates are assumed for MSSQL
Server 2000, MSSQL Server 2000 Analysis Services, and the related Books
Online and Samples.
Overview
In this lesson, we will
introduce a function that assists us in the handling of empty cells. In
multidimensional data sets, we are often confronted with empty cells - data is
often sparse in these sets by their very physical nature. Because, as a simple
example, every product might not be sold at every store in every time period
(to cite an instance from the FoodMart2000 sample environment), we will
see empty cells in abundance in a data set that contains intersects of these
dimensions. (Particularly in working with crossjoins of any magnitude, we will
encounter many empty cells, as a general rule.) Empty cells mean nulls,
and nulls can mean incorrect results in using MDX to support analysis in
reporting.
A logical function, ISEMPTY()
returns true the expression to which it is applied evaluates to an empty cell.
As we will see in the practice example we undertake in this article, ISEMPTY()
works ideally with IIF(), a conditional function, to check cells for
empty or not-empty status.
We will consider
elementary uses of the ISEMPTY() function in this article, and then call
it into service in subsequent articles where we require it as a tool to perform
just this sort of check. In keeping with the objectives of the MDX
Essentials series, we will seek to build a foundation in the rudiments
of the function, from which we can expand to more sophisticated uses in other
articles. As a part of building our basis in the ISEMPTY() function, we
will also take a preliminary look at the IIF() function, which we will
take up in subsequent articles that we devote to it especially.
ISEMPTY()
will likely
become a valued member in the toolset of any practitioner that relies heavily
upon MDX to supply solutions to the organizations they support. We will
introduce the function, commenting upon its operation and touching upon uses at
a general level, and then we will:
-
Examine the syntax surrounding the function;
-
Undertake an illustrative example of the use of the function, in a
multi-step practice exercise;
-
Briefly discuss the results datasets we obtain in the practice
examples.
The IsEmpty() Function
Introduction
The ISEMPTY() function, according
to the Analysis Services Books Online, "returns TRUE if the
evaluated expression is the empty cell value, FALSE otherwise." We will
examine the function's manner of accomplishing
these evaluations, and discuss factors that we should consider based upon the
function's behavior, in the sections that follow.
We will
examine the syntax for the ISEMPTY() function in general, building to its
use in meeting an issue that arises in the real world, where "empties get
in the way" of our objectives of clear analysis and reporting. In this
way, we will be able to clearly see that the ISEMPTY () function does,
in fact, generate the results we might expect. Our objective is to gain a richer
understanding of the capabilities found within the ISEMPTY () function,
together with a feel for potential uses of the function. As a byproduct of our
examination of ISEMPTY(), we will introduce a simple use of the IIF()
function, which will serve to overview the basics of IIF(), as a
preliminary for the two articles that follow this one.
Discussion
ISEMPTY () affords us a means of testing for empty
cells. It becomes valuable in light of the fact that sparseness, as we
have noted earlier, is a fact of life in multidimensional data sets, and is
often an obstacle, as we shall discuss, in our analysis and reporting efforts.
And with empty cells come various data presentation problems that we can remedy
easily, in many cases, using ISEMPTY() to detect the empties and manage
them in various ways.
ISEMPTY() is often accompanied by the conditional IIF()
function, as we shall see in our practice section. It often appears, also,
with the NOT keyword, in scenarios where we are attempting to manipulate
cells, in some way, which are not empty.
Let's look at a syntax illustration to further clarify the
operation of IS EMPTY().
Syntax
Syntactically, the expression upon
which the evaluation of "empty or not empty" is to be applied by the ISEMPTY()
function is placed within the parentheses to the right of ISEMPTY. The
syntax is shown in the following string:
ISEMPTY(<<Value Expression>>)
A logical function, ISEMPTY()
returns "True" if the Value Expression is null,
and "False" if it is not null. That is, ISEMPTY()
evaluates the expression we provide in the parentheses to its right, and
returns either a negative one (-1) or a zero (0), depending upon whether the
expression is revealed to be an empty cell or not, respectively.
If we seek to use a tuple,
versus a basic member reference, as the Value Expression, we must
surround the expression by parentheses, as shown in the following string:
ISEMPTY ( ([Measures].[Units Shipped], [Time].NextMember) )
The following simple example illustrates conceptually the
operation of the ISEMPTY() function. (We will be doing a practice
exercise in subsequent sections, but if you want to "test drive" a
sample, you can certainly cut and paste, or type, the below into the MDX Sample
Application).
We will query the Warehouse sample cube to determine,
for operating year 1998, which State / Province locations
recorded no shipping activity for canned vegetables. We will use a
calculated member ("Activity") that indicates whether there
was shipping activity by indicating "None," if there were no
units shipped, and "Activity:" if, indeed, there were. We
will display the number of Units Shipped for the year in a row below the
string indicator, to demonstrate the fact that the calculated member generates
the expected results.
WITH MEMBER
[Measures].[Activity]
AS
'IIF( ISEMPTY( [Measures].[Units Shipped]), "None", "Activity:")'
SELECT
{[Warehouse].[State Province].Members} ON COLUMNS,
CROSSJOIN(
{[Product].[All Products].[Food].[Canned Foods].[Vegetables]},
{[Measures].[Activity], [Measures].[Units Shipped]}) ON ROWS
FROM
[Warehouse]
WHERE
([Time].[Year].[1998])
This query returns a set similar to that depicted in Illustration
1.
Illustration 1: Returned Dataset
for Example Query
We note that the returned data indicates that two Mexican provinces,
Jalisco and Yucatan, apparently experienced no shipping volume
for canned vegetables within the operating year of 1998.
We will activate the concepts involved in the foregoing
discussions by practicing the use of the ISEMPTY() function in the
section that follows.
Practice
Preparation
To reinforce our understanding of the basics we have covered
so far, we will use the ISEMPTY() function in a manner that illustrates
its operation within a multi-step example. We will first create a query that attempts
to meet a business need as stated by a hypothetical group of information
consumers. The resulting dataset will illustrate a scenario in which we might
want to manage an empty underlying cell. We will then expose a means, using
the ISEMPTY() function, to provide the results we desire in the final
presentation of the data.
We will call upon the MDX Sample Application again,
as our tool to construct and execute the MDX we examine, and to view the results
datasets we obtain.
1.
Start the MDX
Sample Application.
2.
Clear the top
area (the Query pane) of any queries or remnants that might appear.
3.
Ensure that FoodMart
2000 is selected as the database name in the DB box of the toolbar.
4.
Select the WAREHOUSE
cube in the Cube drop-down list box.
Let's assume for our practice example that we have received
a request from a group of information consumers in the Logistics unit of
the FoodMart2000 organization. The request is for support in the
presentation of some data, housed within the Warehouse sample cube,
regarding Product shipping volumes for 1998. The consumers wish
to see the total Units Shipped for FoodMart200 products by Product
Family. They also wish to see a second column in the display that presents
a percent of parent value - that is, they want to see the percentages of total
product units shipped that can be attributed to each Product Family.
We will begin by composing a simple query to meet the
business requirement, which will illustrate an effect of an empty cell upon the
display of the data. We will then eliminate the problem in subsequent steps,
after discussing why the problem exists in the first place. My objective is thus
to illustrate use of the function to meet the hypothetical business requirement
we have outlined, as a means of reinforcing the concepts in our minds. Once we
have activated the concepts, they can be triggered upon meeting a similar
situation in our respective business environments.
To initiate action, let's attempt to meet the requirement in
a somewhat intuitive way.
5.
Type the
following query into the Query pane:
-- MDX25-1: Tutorial Query Step 1
WITH MEMBER
[Measures].[% Total Products]
AS
'[Measures].[Units Shipped]/([Measures].[Units Shipped],[Product].Parent)'
SELECT
{[Measures].[Units Shipped], [Measures].[% Total Products]} ON COLUMNS,
{DESCENDANTS([Product].[All Products], [Product].[Product Family],
SELF_AND_BEFORE)} ON ROWS
FROM
[Warehouse]
WHERE
([Time].[Year].[1998])
The
purpose of the WITH MEMBER section of the query is to create a
calculated member to provide the "percent of parent" (that is,
percent of total Product Units Shipped) contributed by each of the Product
Family members, including the total of all Product Families. Reason
tells us that the Total Products line in the data set returned should
represent 100.00 %, and discussions with members of the intended audience
confirms that this is what they wish to see. Herein lies a complication that
rests at the heart of a need for management of empty cells, as we shall see
shortly.
The SELECT
statement simply requests the calculated member, alongside the Units Shipped
measure for each respective Product Family's activity for 1998. At
this stage, we have added no formatting for the calculated member.
The
Results pane is
populated by Analysis Services, and the dataset shown in Illustration 2 appears.
Illustration 2: Initial Results Dataset
Upon initial blush, we can see that
something funky seems to be happening with the calculated member for % Total
Products, but to ascertain how the calculated member is performing overall,
we first need to format it to indicate percentages correctly. What we see
initially is a rounding effect that is making the effectiveness of the
calculated member unclear.
7.
Select File
-> Save As..., name the file MDX25-1,
and place it in a meaningful location. Leave the file open for the next step.
8.
Modify the
comment line to read:
-- MDX25-2: Tutorial Query Step 2
9.
Save the file
as MDX25-2, to protect MDX25-1.
Now we
will take another preparatory step and add formatting into the WITH MEMBER
clause, to make your percentages appear in a manner that is useful to the
information consumers.
10.
Insert the following
string:
, FORMAT_STRING = '#.00%'
between the existing fifth line of the query (counting the
comment line, and shown below):
'[Measures].[Units Shipped]/([Measures].[Units Shipped],[Product].Parent)'
and the existing sixth line of the query, the SELECT
keyword.
The
Query pane
appears as shown in Illustration 3, with changes circled in red.
Illustration 3: Modified Query in Query Pane (Compressed
View)
11.
Execute the
query by clicking the Run Query button in the toolbar, once again.
The
Results pane is
populated once more, and the dataset depicted in Illustration 4 appears.
Illustration 4: Results Dataset, with Percent Formatting
Enhancement
We see the percentages displayed,
filling the blanks that existed before due to undesirable rounding.
12.
Select File
-> Save to save the query as MDX25-2.
Leave the file open for the next step.
13.
Modify the
comment line to read:
-- MDX25-3: Tutorial Query - Final
14.
Save the file
as MDX25-3, to protect MDX25-2.
We have now eliminated the display issue of percentage format, and have only to take the necessary steps to present "100%" as the percent of total on the Total Products line. Here we will use the ISEMPTY() function to remedy the problem (an error indicated in the cell value calculation) that is causing the nonsensical result of "1.#INF" to appear in the field.
The problem that underlies the appearance of the 1.#INF result is division by zero. The expression that makes up our calculated member (inside the WITH MEMBER clause we constructed earlier) is as follows:
Measures].[Units Shipped]/([Measures].
[Units Shipped],[Product].Parent)
The MDX above contains a .Parent function. This reference to the .Parent function works fine in the cases of each of the Product Families, because each of the families has a parent, the All Products level of the Product dimension. Total Products, however, which represents the All Product level itself, has no parent - All Products occupies the highest level in the dimensional hierarchy.
Because it has no parent, the value returned when we subject All Products to the .Parent function is null. Were this simply a reference to [All Products].Parent, an empty cell would be returned. However, when we use the reference within a calculation as a divisor, as we have done above, the result is effectively an attempt to divide by zero: MDX treats "null" as "zero," which causes the underlying error.
To eliminate the error, and to display the accurate value of "100.00%", we need to combine the IIF() function, which is employed to evaluate a conditional expression, with the MDX ISEMPTY() function, which, as we have noted, indicates whether a given value expression (in our case, a divisor of the expression forming our calculated member) is empty.
To meet the expressed needs of the information consumers with which we are working, we will take the following steps:
1. Modify the WITH MEMBER, "AS" clause's contents, currently appearing as:
'[Measures].[Units Shipped]/([Measures].[Units Shipped],[Product].Parent)'
to the following:
'IIF(ISEMPTY(
([Measures].[Units Shipped],[Product].Parent) ), 1,
[Measures].[Units Shipped]/([Measures].
[Units Shipped],[Product].Parent))'
Here we are simply surrounding the original string with An ISEMPTY() function, placing it into the parentheses to the right of the function (recall that, since we are referencing a tuple versus a simple member reference, we need two sets of parentheses: one for the function and one to enclose the tuple).
We enclose the ISEMPTY() function in the IIF() function. Within the resulting combination of the two functions and our references, we are saying, "if the tuple ([Measures].[Units Shipped],[Product].Parent) is empty, provide a 1 (which equates to 100 percent, when formatted); if the tuple is not empty, substitute the results of the calculation [Measures].[Units Shipped] / ([Measures].[Units Shipped],[Product].Parent."
IIF() is restricted to two numeric return values or two string return values (we cannot mix the two in a single use of the function), as we shall see in our next articles. We are working with numeric values here, where IIF() works well also because it cannot return null as a value (null is neither a string nor a numeric).
The Query pane appears as shown in Illustration 5, with additions circled in red.
Illustration 5: Modified Query in Query Pane (Compressed View)
2. Execute the query by clicking the Run Query button in the toobar.
The Results pane is populated, and the dataset shown in Illustration 6 appears.
Illustration 6: Results Dataset, With the IIF() / ISEMPTY() Combination in Place
We can now see that our problem with the spurious data in the % Total Product column is a thing of the past. "100.00 %" is correctly reflected, thanks to our incorporation of conditional logic into the definition of the calculated member.
3. Select File -> Save to save the query as MDX25-3.
4. Exit the Sample Application when ready.
Thus we meet the business requirement through the use of the ISEMPTY() function, demonstrating a scenario where it helps to overcome presentation issues when empty cells are involved. In combination with the IIF() function, ISEMPTY() is a valuable tool we can use to test cells for emptiness, and to handle them when ISEMPTY() returns a positive for the empty state.
Summary ...
In this article, we explored the ISEMPTY() function, whose purpose is to return a "true" if the expression to which it is applied evaluates to an empty cell. We noted that ISEMPTY() is an excellent means of identifying empty cells. We discussed the fact that empty cells are a common fixture of multidimensional sets, and then began a practice example that included the presence of such an empty, together with an undesirable consequence in the presentation of data to information consumers.
In addition to introducing the ISEMPTY() function in an exercise where we might see its value in managing empty cells, we introduced the IIF() function in the same example, combining the two functions to provide for the conditional application of our solution to manage an empty cell, as well as the divide-by-zero error that it brought into being. We examined the syntax surrounding the ISEMPTY() function, before beginning our multi-step practice exercise to illustrate the combined use of IIF() and ISEMPTY(). Moreover, we discussed the application of formatting our presentation to meet the needs of the intended audience. Finally, we discussed the results datasets we obtained in each of the steps of our efforts.
» See All Articles by Columnist William E. Pearson, III
Discuss this article in the MSSQL Server 2000 Analysis Services and MDX Topics Forum.