About the Series ...
This is the sixth tutorial article of the
series, MDX in Analysis Services. The series is designed to
provide hands-on application of the fundamentals of MDX from the perspective of
MS SQL Server 2000 Analysis Services ("Analysis Services,");
our primary focus is the manipulation of multidimensional data sources, using
MDX expressions in a variety of scenarios designed to meet real-world business
intelligence needs.
For more information on the series, as well as the hardware /
software requirements to prepare for the tutorials we will undertake, please
see Tutorial 1: MDX Concepts and Navigation.
Note: At the time of writing, Service
Pack 3 updates
are assumed for MSSQL Server 2000, MSSQL Server 2000 Analysis
Services, and the related Books Online and Samples.
Introduction
In our
last tutorial, we expanded upon the
topics we introduced in our previous lesson, Retrieve Data from Multiple Cubes. Introducing
the concept of time within the context of expression design, we
worked through examples of providing support for time-based analysis, such as
the quantification of change over time in values, with MDX functions
designed for that purpose.
In
this lesson, we will begin an exploration of MDX
Sets. We will need to have a sound basis in MDX query fundamentals to do
so, so we will examine the makeup and operation of MDX queries as a preparatory
step. We will then move into our examination of MDX Sets. We will:
-
Discuss the differences between
MDX expressions and MDX query statements;
-
Create rudimentary MDX
queries to gain an understanding of their components and uses;
-
Explore MDX sets, their
functional uses, and the steps involved in their creation;
-
Practice the use of set
functions as a part of working with MDX sets.
Introducing MDX
Queries
The first order of business
in this tutorial will be to gain a good understanding of MDX query
statements, and to understand how they differ from the multidimensional
expressions with which we have worked through our last lesson. As we have
noted in past tutorials, an expression calculates a single
value. An expression can be used within various applications, among them
Excel PivotTable Reports or an Office PivotTable List, to retrieve values from
an OLAP data source. In contrast, a query is used to extract multiple
values from a cube. Queries underlie the reporting that we might do in the PivotTable
components, and are often largely invisible to us when we develop our reports
visually with these and other tools.
We often
juxtapose numerous expressions, and the values they retrieve, to obtain a
multidimensional value for analysis. We will learn to create increasingly
sophisticated MDX queries in this and prospective tutorials, and will be
able to understand the underlying query components that are generated in the
background for us when we create MDX expressions with visual tools. Knowing
how to create queries will equip us to retrieve values to support the immediate
reporting requirements of our organizations' information consumers, as well as
to maintain those queries to meet changing demands, through the precise
specification of dimensional criteria.
Key Concepts and Terminology
We will begin our exploration of MDX
queries by emphasizing the concepts involved in controlling the values
that we extract from our cubes within precisely defined levels of specific
dimensions. Recall from previous lessons that a cube is a conceptually multidimensional
structure; the intersect points / intersections of the dimensions are
where data reside, in single or multiple elements called measures.
As we also discussed in previous
lessons, MDX uses a reference system involving the tuples concept to
identify and extract data, whether it be data in a single cell or a block of
cells. Tuples list dimensions and their members (which include levels)
to "address" individual cells, as well as sections of
cells, within the cube, and, because any given cell is an intersection of all
the dimensions in the cube, tuples can be used to uniquely identify
every cell in the cube. As a means of reference, measures themselves are
treated as a special sort of dimension, named Measures within Analysis
Services.
Because tuples uniquely identify
sections of the cube, based upon the dimensional intersections that define the
section's "address," they have no need to refer to any specific cell
or cells in doing so. They represent subsets of the multiple dimensions
of the cube, and provide slices that encompass more than one cell.
We will work with
ordered groups of tuples, referred to as sets, later in
the lesson. Common examples of set usage include axis dimensions (the
dimensions and members to be returned, specified in the SELECT
statement in the query, as we will see) and slicer dimensions (the
specific dimension and member criteria to which the returned data is restricted,
used by the WHERE statement, as we will discover). The axis
dimension exists to return data for multiple members, while the slicer
dimension is used to return data for a single member. The axis and
slicer dimensions work in conjunction to define, in terms of the source cube
(identified in the FROM clause of the query), the subset of dimensions
that make up the result cube, which itself can be composed of multiple
dimensions. More than
one dimension can reside on either of the two axes, as we shall soon see, and
any given dimension can exist on either axis (but never both). This fact,
coupled with the capabilities to move the dimensional members about at will
within our reports, as well as with the capability to drill down and zoom on
dimensional member levels, are what make multidimensional analysis
possible in our reporting efforts.
Understanding the
Metadata
As many of us are aware,
MSSQL Server 2000 comes equipped
with a tool called the Query Analyzer, which allows us to input a SQL
statement, execute it, and see the results from a couple of different views.
Similarly (in a simple sense), Analysis Services provides this
capability via a sample application that is installed with the typical
installation of the package. Also similar to its MSSQL Server
counterpart, the MDX Sample Application provides a Metadata pane
that outfits us with a means of interacting with cube objects visually. We can
thus easily work with MDX expressions and build MDX queries. As we progress in
our exploration of MDX queries, we will rely upon the MDX Sample Application to
make writing queries easier, as well as to help us understand the structure
of the information that our cube contains. This can be especially useful as a
means of grasping the structure of a cube with which we are not familiar, and /
or with which we had no design involvement.
In the Books Online,
which are bundled with any typical installation of MSSQL Server 2000 /
Analysis Services, or which can be accessed on the CD from which the
installation is performed, Microsoft defines metadata as information
about data, (or data about data, hence the term "metadata."
). Metadata concerns itself with data and its properties, such as data type
(text, numeric, etc.), or column sizes. Metadata also comprises information
surrounding data structures, or information concerned with the design or
makeup of objects, such as cubes, dimensions, levels, and so forth. Analysis
Services provides the MDX Sample Application to provide a vehicle from
which new users can issue MDX queries against an Analysis server, as well as to
illustrate (along with the underlying source code that is freely available) a
working example from which developers can understand how to create custom
applications for querying with MDX.
Sets
in MDX Queries
We will begin our
hands-on exposure to MDX by taking advantage of the Sample Application to
initially generate a simple query, then to move into queries that are more
sophisticated. In this way, we will gain an understanding of MDX queries "from
the ground up" in an efficient manner, expanding our grasp of both MDX and
our cube's metadata simultaneously.
Basic Query Components
Let's begin by creating a rudimentary
query using the Sample Application. We will begin by taking the following
steps:
1. Go to the Start
button on the PC, then navigate to Microsoft SQL Server --> Analysis Services, then to the MDX Sample
Application.
We are
initially greeted by the Connect dialog, shown in Illustration 1.
Illustration 1: The
Connect Dialog for the MDX Sample Application
The
illustration above depicts the name of my server, MOTHER, and properly
indicates that we will be connecting via the MSOLAP provider (the
default).
2. Click OK.
(We might also choose
to cancel the dialog box and connect later by clicking Connect on the File
menu.)
The MDX
Sample Application window appears.
3. Clear the top
area (the Query pane) of any remnants of queries that might appear.
4. Ensure that FoodMart
2000 is selected as the database name in the DB box of the toolbar.
5.
Select the Warehouse
cube in the Cube drop-down list box.
The MDX
Sample Application window should resemble that shown in Illustration 2,
complete with the information from the Warehouse cube displaying in the Metadata
tree (left section of the Metadata pane).
Illustration 2: The MDX
Sample Application Window (Compressed)
Many similarities
exist between the arrangements of objects in the Metadata pane and the Calculated
Member Builder, which we have explored in earlier lessons of this series.
We will discuss various attributes of the MDX Sample Application when they are
relevant to the exercises we undertake; it might prove highly useful to explore
the Books Online for a wealth of detail about the application.
6.
Type the
following query into the Query pane:
SELECT
FROM Warehouse
7.
Click the Run
Query button on the toolbar (the button sporting the green arrowhead-shaped
icon - a tool tip will alight when the cursor is placed upon the button to
positively identify it for us).
We see
the number 102,278.41 appear in the in the Results pane. This amount
represents the "grand total" of all measures in the Warehouse
cube, because the query we have input is simple, and specifies no
members from any dimensions within the cube. We are thus awarded the grand
total, because the query supplies the default member for each dimension;
this is standard behavior when no specifics are stipulated in the query.
8.
Save the query
by selecting File ->
Save As (the
Save Query File button on the toolbar will work equally well), and call
the file MXAS06-1.
9.
Click Save
to save the file, after navigating to a meaningful location of your choice.
Note: I typically prefer to save files
to a context-oriented directory / folder (for example, a folder I have created
for a client for whom I am writing MDX queries as a part of an engagement, or
for a class I am teaching). This is obviously a point of personal taste; our
objective is to keep track of where the queries are so that we can find
them in time of need. Much rewriting and confusion between altered versions
can be avoided by storing the queries in a logical system of some sort to keep
organized. My favorite way to do this is to create a database within which to
store the query strings, together with descriptions, author and keyword
information, along with creation / modification datetime data and "version"
information, if applicable.
Working with Sets in
Our Queries
We can
generate queries that are far more sophisticated by adding sets of
members to one or both of the axes. Let's explore this, with the following
steps:
10. Select Query
-> New from the top menu to set up a new
Query pane.
11.
Type in the
following query once again, for starters:
SELECT
FROM Warehouse
NOTE: I often intentionally add
generous space between rows, and between components of the coding, in my lessons.
This is to make the coding easier to follow.
12.
Placing the
cursor behind word SELECT, type in one space.
13.
Expand (by clicking on the "+"
sign to its left) the Set folder in the Syntax Examples list on
the right side of the Metadata pane.
14.
Select the Members-Level
function by double-clicking.
The
function appears, amid our initial query, in the Query pane, as shown in
Illustration 3.
Illustration 3: Building
the Query: Inserting a Set Function
15.
Click the <<Level>>
token once to highlight it.
16.
Expand the Store
dimension in the Metadata tree to expose its member levels.
17.
Double-click
the Store State level, within the expanded
Stores dimension.
18.
Type another
space just behind the .Members function of the top row.
19.
Type the words
ON ROWS.
Now we
will arrange the query a bit, as we build it further.
20.
Click to the left
of [Store], just after SELECT.
21.
Press the ENTER
key.
22.
Place the
cursor behind the SELECT statement.
23.
Press the ENTER
key again, to create a blank row under SELECT.
24.
Expand the Set
folder in the Syntax Examples list, once again, if necessary.
25.
Double-click the
Children function.
The
function populates the newly created row.
At
this stage, the statement should duplicate the following:
SELECT
<<Member>>.Children
[Store].[Store State].Members ON ROWS
FROM Warehouse
26.
Select the <<Member>>
token with a single-click.
27.
Double-click
the Store Type dimension in the Metadata tree to select it into
the <<Member>> token.
Our
statement should now appear as below:
SELECT
[Store Type].Children
[Store].[Store State].Members ON ROWS
FROM Warehouse
28.
Type in a
space, then the words ON COLUMNS after Children in the above
statement.
29.
Add a comma (",")
immediately after the word COLUMNS.
Our
statement now appears as follows:
SELECT
[Store Type].Children ON COLUMNS,
[Store].[Store State].Members ON ROWS
FROM Warehouse
The ON
COLUMNS addition specifies that the second set be added to the column
axis. Anytime that we create multiple axes, as we are doing within this
step of our example, we are required to designate the respective axis with
which each such string is associated. Order in the query is not as important
as the fact that the designation itself is made. (A single axis need not be
defined, as it defaults to Column axis, similarly to the case we observed in
the first query in this lesson).
30.
Select Query
--> Run from the main menu.
We obtain
the results shown in Illustration 4 below.
Illustration 4: The
Results of the Two-Axis Query
31.
Save the query
as MXAS06-2.MDX.
In
addition to using other members' functions (explore these in the Set
folder of the Syntax Examples list as you find time), we can make a
simple alteration to our query statement to cause it to return the entire
membership of the Store dimension.
32.
Remove "[Store State]." from the query (including
the "." delimiter that follows the right bracket) from the query we have designed.
Our
modified statement appears in the Query pane as shown in
Illustration 5. (I have created a composite picture, removing many of the
empty rows, to conserve space.)
Illustration 5: The
Modified Query
33.
Click the Run
Query button to execute the query.
We obtain
the results partially shown in Illustration 6 (scrolling to the
appropriate rows):
Illustration 6: The
Store Dimension's Entire Population is Retrieved (Composite View)
34.
Save the query as MXAS06-3.MDX.
Specifying Members
in a Set
As we see from the
fruits of our labor so far, we can assign sets to axes with relatively
little programming effort. This is particularly true when we can rely upon
many of the default member settings to "fill in the blanks," with regard
to the requirement to supply "addresses" for the values we seek to
retrieve, in terms of all dimensions in the cube. However, to approach the content
of a report that leverages multi-dimensional concepts to produce denser
information in a useful form, we must go further than this. We must be able to
specify individual members, for purposes of extracting precise
information from the wealth of cube data, as well as to be capable of combining
multiple dimensions on a single axis. MDX rises to the occasion, and
supports our need for precise control of the presentation quite well.
Let's explore meeting these
further requirements, with the following steps:
35.
Start a new query using the New
Query button.
36.
Type the following into the Query
pane:
SELECT
{[Measures].[Warehouse Cost]} ON COLUMNS
FROM Warehouse
We
enclose the set we want to retrieve in braces ({}). The
braces inform the application that the enclosed member(s) represent a set.
The
results we obtain should be a single "grand total" for Warehouse
Cost, and should duplicate those shown in Illustration 7.
Illustration 7: the
Grand Total of the Store Cost Measure for the Cube
The focus here is the
selection of an individual measure for placement on the column axis. Our
enclosing the measure in braces defines its status as a set, as we have
noted, which is a requirement for anything that is placed on an axis within
that status.
Now let's say that our
business requirement changes: we are asked to display not only the Warehouse
Cost value, but two additional values, Warehouse Sales and Units
Shipped. We will deliver these tuples with the following enhancement to
our query.
37.
Add the two
additional measures, separated by commas and spaces, to the query as follows:
SELECT
{[Measures].[Warehouse Cost],[Measures].[Warehouse Sales],
[Measures].[Units Shipped]} ON COLUMNS
FROM Warehouse
38.
Click the Run
Query button.
The
results appear as shown in Illustration 8.
Illustration 8: Three
Tuples Yield Three Measure Columns
Our set
now consists of three tuples. We have thus presented a multiple member set,
identifying those members specifically in our query.
Using
the CrossJoin Function to Nest Multidimensional Sets
Now, let's
nest multiple dimensions into the row axis, to demonstrate a useful approach to
meeting business needs in this way, if required. We can do this with the CrossJoin()
function.
39.
Modify the
query to match the following:
SELECT
{[Measures].[Warehouse Cost] , [Measures].[Warehouse Sales],
[Measures].[Units Shipped]} ON COLUMNS,
[Store].[Store State].Members ON ROWS
FROM Warehouse
Here
we are simply juxtaposing our last statement, where we are specifying the
multiple member set for the column axis, with a statement using a set
function (.Members) to specify the row axis. Because we are
using a set function, note that we can leave off the braces.
40.
Click the Run
Query button.
The
results appear as shown in Illustration 9.
Illustration 9: The
Result Set, Specifying Both Axes
Now, let's say we need
to nest two sets in the row axis set. We will do this with the CrossJoin()
function we discussed above.
41.
Modify the
query to add the CrossJoin() function, just before the ON ROWS
instruction, as shown below (the CrossJoin() function appears in blue):
SELECT
{[Measures].[Warehouse Cost] , [Measures].[Warehouse Sales],
[Measures].[Units Shipped]} ON COLUMNS,
CrossJoin([Store].[Store State].Members, [Product].[Product Family].Members) ON ROWS
FROM Warehouse
42.
Click the Run
Query button to view the effect of the changes (scroll to the USA
Store State rows).
The
results appear as partially depicted in Illustration 10.
Illustration 10: The
Result Set, With a Little Help from the CrossJoin Function
The CrossJoin()
function has enabled us to juxtapose two sets, from two dimensions, as
if it were one combined set, on the row axis. This produces the nesting
of the Product Family level members into the Store State levels
that we see above, and gives us a practical preview of the power of the CrossJoin()
function.
Now,
let's go one step further for the sake of finesse, and remove the many empty
spaces in our new report. This is a simple matter of replacing the CrossJoin
function with a derivative function, the NonEmptyCrossJoin() function,
as shown below (I have highlighted the only change over the last query in
blue):
SELECT
{[Measures].[Warehouse Cost] , [Measures].[Warehouse Sales],
[Measures].[Units Shipped]} ON COLUMNS,
NonEmptyCrossJoin([Store].[Store State].Members, [Product].[Product Family].Members) ON ROWS
FROM Warehouse
43.
Modify the
query to reflect the change in the function name noted above.
44.
Click the Run
Query button.
The results
appear as depicted in Illustration 11.
45.
Save the query as MXAS06-4.MDX,
and exit the Sample Application as desired.
We now have a report,
albeit simple in structure that hints at the potential available to us within
MDX for multidimensional analysis. We have used the CrossJoin()
function to create a report that allows us to simultaneously analyze multiple
dimensions, as well as to nest two of these dimensions on one axis (in our case
the Row axis) as part of the objective.
Next in Our Series ...
In
this lesson, we began
an exploration of MDX queries. We discussed the differences between MDX
expressions and MDX query statements, and as part of
this overview, discussed key MDX concepts and terminology. We followed
our overview with an exploration of MDX query building from the ground
up, using the MDX Sample Application as a vehicle for crafting our statements
and practicing their use.
We
delved into set functions, and the creation and use of sets,
discussing their importance in our MDX queries. We progressively built the
specification of members, and the combination of multiple dimensions, into our
row and column axes, to add truly multidimensional capabilities to the reports
we produce for ourselves and for the information consumers we support.
Our
next topic will span a three-article set. In these three lessons, we will
focus on building and using calculated members within our queries to meet
various business needs. We will begin by previewing the creation of dynamic
calculated members to set the stage for the functionality and processes we will
explore together.
Throughout the Building Calculated Members articles,
we will explore and practice the creation of increasingly sophisticated calculated
members, discussing various types and their uses. We will further transit into
the realm of using functions within our calculated members, practicing different
uses for the components we have explored. Finally, we will preview member
functions, in preparation for lessons to follow Building Calculated
Members.
»
See All Articles by Columnist William E. Pearson, III