About the Series ...
This is the twelfth tutorial article of the series, MDX
in Analysis Services. The series is designed to provide hands-on
application of the fundamentals of MDX from the perspective of MS SQL Server
2000 Analysis Services ("Analysis Services,");
our primary focus is the manipulation of multidimensional data sources, using
MDX expressions, in a variety of scenarios designed to meet real-world business
intelligence needs.
For more
information on the series, as well as the hardware / software requirements to
prepare for the tutorials we will undertake, please see the first lesson of
this series: MDX
Concepts and Navigation.
Note: At the time of writing, Service
Pack 3 updates
are assumed for MSSQL Server 2000, MSSQL Server 2000 Analysis
Services, and the related Books Online and Samples.
The screen shots that appear in this article were taken from a Windows 2003
Server, and may appear somewhat different from coinciding views in other
operating systems.
Introduction
In our last tutorial, More
on Location, and the Importance of Arrangement, we returned to our
three-part mini-series, Optimizing MDX. We continued our focus from the
first article of the series, Control
Location of Processing, exploring the use of control of
location as a primary intervention type for MDX query optimization. We
performed a practice exercise to reinforce the concepts exposed, and then
extended our considerations of additional types of intervention to include the optimization
of set operations and syntax arrangement considerations. Within our
exploration of the optimization of set operations, we undertook practice
examples that illustrated some ways we can rearrange queries to enhance
performance, often significantly.
In this lesson, the
final article of the current Optimizing MDX mini-series, we will
expose methods of caching to load a commonly used slice of a cube into
memory, making for faster retrieval in prospective operations. Our discussion
will include various aspects of cache creation, and uses of caching within MDX.
In addition, we will touch upon other performance enhancement options,
including external functions and cube design modifications and
augmentation.
Caching and Optimization
When queries
are not well underpinned by aggregates, we can often enhance performance by
creating and caching the appropriate aggregates in memory. Caching is a feature
that MDX provides to improve performance; caching affords us the capability of
loading a commonly used slice of a cube into memory, "caching" it for
faster retrieval by our queries.
Analysis Services and the
PivotTable Service automatically cache query definitions, data and meta data on
the server and client sides, respectively. Caching increases performance in
those cases where queries are repeatedly requesting the same data or meta data,
reducing network traffic or execution time. The ability to create caches for
data that we specify in MDX gives us another means of fine-tuning query
performance; through this capability, we realize a great degree of control over
the caching of data for which we expect there to be a recurring need.
In terms of creation
scope, caches are analogous to named sets: we can create a cache for
the lifetime of a single query, or for a session. To create a cache to be used
at the session level, the CREATE CACHE statement can be used. The
CREATE CACHE statement can be used to create caches at the query level,
but the WITH statement, with which we are now somewhat familiar, can
perform this task just as easily, and is more frequently used for this purpose.
Let's take a look at the
use of the WITH statement to create a cache in an MDX query. First, we
will call our old friend, the MDX Sample Application, as a platform from
which to perform our practice exercises.
1.
Start the MDX
Sample Application.
We are
initially greeted by the Connect dialog, shown in Illustration 1.
Illustration 1:
The Connect Dialog for the MDX Sample Application
The
illustration above depicts the name of my server, MOTHER1, and properly
indicates that we will be connecting via the MSOLAP provider (the
default).
The MDX
Sample Application window appears.
3.
Clear the top
area (the Query pane) of any remnants of queries that might appear.
4.
Ensure that FoodMart
2000 is selected as the database name in the DB box of the toolbar.
5.
Select the Warehouse
cube in the Cube drop-down list box.
The MDX
Sample Application window should resemble that depicted in Illustration 2,
complete with the information from the Warehouse cube displaying in the Metadata
tree (left section of the Metadata pane).
Illustration 2: The MDX Sample Application Window
(Compressed View)
Let's look first at
creating a cache with a query scope. To do so, we will take the
following steps:
1.
Create the
following new query:
-- MXAS12-1: WITH CACHE Query
WITH CACHE AS
'([Product].[Product Department].Members)'
SELECT
{[Measures].[Warehouse Sales]} ON COLUMNS,
{[Product].[Product Department].Members}
ON ROWS
FROM Warehouse
2.
Execute the
query using the Run Query button.
The
results dataset appears as shown in Illustration 3.
Illustration 3: WITH
CACHE Query
3.
Save the query
as MXAS12-1.
Keep in mind that the "life"
of the WITH CACHE statement is only as long as the query in which it
resides. Use of the WITH CACHE statement can sometimes result in more
rapid completion of the overall query, because the full set of cells that we
have specified in the statement arrives at the client before the
multidimensional data set is returned. In scenarios where the server is
accessed via a high speed LAN by the application generating the query, the
small performance enhancement may be negligible. By contrast, scenarios where
access is over WAN links or modem connections, query results may be retarded in
first making an appearance, but will likely require less time to retrieve
overall.
Now, let's take a look at
the use of the CREATE statement to create a cache in an MDX query. We
will create a cache with session scope, as we have already created a
cache with query scope above.
4.
Create the
following new query:
-- MXAS12-2: CREATE CACHE Query
CREATE SESSION CACHE FOR Warehouse AS
'(
Descendants (
[Time].[1998],
[Time].[Month]
),
[Warehouse].[Country].Members
)'
5.
Execute the
query using the Run Query button.
6.
Save the query
as MXAS12-2.
The query creates a cache with session scope; we
notice that no measures are specified this time. This is because all the cube's
base measures are loaded into the cache at runtime. The CREATE
statement above does not take noticeably more time to execute than would the
core query; immediate execution of the query will occur subsequent to cache
creation, however, because the query would process in its completeness from
RAM, where the cache is housed.
Before
we decide to use caching, we need give thought to whether the query we are
attempting to improve through caching will actually benefit from means other
than redesign of the query itself, and whether the caching process can
realistically provide improved performance in general. Obviously, a one-time
query is not likely to benefit from caching. In addition, numerous situations
will exist where, although a large population of cells are specified in a
query, only a few cells are actually accessed. To use caching, for example, in
a scenario where only specific tuples out of a large CrossJoin are
actually used, may mean more processing time to cache the population of cells
than will be saved in performance gains for the few cells actually used. The
cache statements are best left to scenarios where their effects are likely to
increase performance and where the query actually has a need for tuning from
the outset.
7.
Close the Sample
Application.
Other Performance Enhancement Options
External
functions can often offer processing optimizations over their calculated
equivalents, particularly when the calculated members are complex. External
functions can be used in a query via the presence of the associated calls to
those functions, if
the associated function library is installed in the appropriate place(s) on the
client or server. Queries making use of these functions can be resolved on
either the client or the server if the function resides on either tier, and, as
long as the external query is not used in making the determination of the axes
within the query, then the query can be largely resolved on the server, even
when the function does not reside on the server. Keep in mind, however, that
the presence of the function solely on the client means forced client-based
processing regardless of other factors.
Finally, cube
design modification and / or augmentation can provide significant efficiency in
cube processing, even though this might often rest outside the control of the
query designer. Such improvements as placing member properties into measures
(especially numerical data) allow for free and efficient use of these fields at
multiple levels. And the additional processing that can be encountered by
making this measure available as a calculated member can be mitigated
significantly by placing the measure in a "custom" cube
(one-dimensional cubes work fine, if adequate to meet the need), which can then
be combined with other cubes - cubes that need the new measure - in virtual
cubes. This concept can be extended into many other areas, and provides an
excellent way to leverage the existing cube structure by "adding on"
needed components.
Summary and Conclusion ...
In
this, the concluding lesson of our Optimizing MDX mini-series, we
extended our toolset by adding yet another intervention type for optimizing our
MDX queries. We exposed two methods of caching to load a commonly used
slice of a cube into memory, making for faster retrieval in prospective
operations. We discussed the creation scope for caches, within the
context of both the CREATE CACHE and the WITH CACHE statements,
and discussed appropriate uses for each. Finally, we touched upon other
performance enhancement options, including external functions and cube design
modifications and augmentation.
While
this brings our Optimizing MDX mini-series to a close for the
present, we will likely add new segments to the set from time to time, on an ad
hoc basis. Optimization is obviously an area of great interest to developers
and consumers alike, and it is a rare week indeed when yet another nuance for
more efficient MDX does not present itself at client sites and / or in my lab.
Stay tuned for more tips in the months to come!
»
See All Articles by Columnist William E. Pearson, III