Benchmarking Systems
Henry Newman
This month, I will briefly cover benchmarking systems. I think it is important
to discuss benchmarking before covering RAID internals, backup/restore, tapes,
hierarchical storage management, and all the other parts of a system, including
architecture for high availability. Even if you do not require a benchmark,
understanding how one is run is an important part of making storage decisions.
This column therefore covers the benchmarking process as an introduction to
storage architecture and the storage decision process.
Defining a good benchmark that is fair to the vendors and, more importantly,
fair to the purchaser is very difficult. Detailing the process of developing
specific benchmarking rules or determining evaluation criteria is impossible
here because so much depends on your own operating environment. My goal in this
column is therefore to give an overview of the benchmarking process and offer
some general guidelines and suggestions on how to make the process fair for
both parties (the vendor and the purchaser).
What Is Fair
Vendors generally do not want to benchmark servers, and they want to benchmark
storage even less. Benchmarking storage is very expensive, involving
not only the cost of the server, HBAs, FC switch, and RAID, but also the software
expense of the file system and/or volume manager. If you couple that with the
fact that you need a good applications analyst, a system tuner, a RAID guru,
someone to write a report, and a project manager, you end up with a huge cost
to the vendor.
On the other hand, if you are planning to purchase roughly $400,000 or more
worth of storage equipment, you really need to know whether any given vendor
with any given product can meet your requirements. However, you cannot expect
a vendor to run a benchmark that costs them $40,000 for a system that costs
$500,000; that is 8% of the sale. A good rule of thumb is that the benchmark cost should be less than
2% of the total system value. You might be able to get that to 5%, but remember
the vendor will recover the cost of the benchmark one way or another.
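The rule of thumb above is simple arithmetic, and a short Python sketch makes it easy to check a proposed benchmark against it. The function names and the labels "reasonable", "negotiable", and "excessive" are my own illustrative assumptions; only the 2% and 5% figures come from this column.

```python
def benchmark_cost_share(benchmark_cost, system_value):
    """Return the benchmark cost as a fraction of the total system value."""
    if system_value <= 0:
        raise ValueError("system_value must be positive")
    return benchmark_cost / system_value


def classify_request(benchmark_cost, system_value, target=0.02, ceiling=0.05):
    """Classify a benchmark request against a 2% target and a 5% ceiling.

    The labels are hypothetical; pick whatever wording suits your procurement.
    """
    share = benchmark_cost_share(benchmark_cost, system_value)
    if share < target:
        return "reasonable"
    if share <= ceiling:
        return "negotiable"
    return "excessive"


if __name__ == "__main__":
    # The example from the text: a $40,000 benchmark for a $500,000 system.
    share = benchmark_cost_share(40_000, 500_000)
    print(f"{share:.1%}", classify_request(40_000, 500_000))  # 8.0% excessive
```

Running the same check before you write the benchmark specification tells you roughly how much vendor effort you can fairly ask for.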
Requirements
Analyzing your requirements is key to developing a benchmark that meets your
needs. If you are benchmarking storage, you need to look at your requirements,
such as:
1. Space
2. Backup and recovery
3. Performance
4. Reliability and repair time
5. Application usage
6. File system being used
Benchmarking storage is hard because there are so many levels of indirection.
These include:
1. The application I/O request
2. What the system does to that request
3. How the file system and/or volume manager handle the request
4. What happens in the HBA with the queue of requests
5. In the switch, if one is involved, latency and the Fibre Channel credits
needed to set up a write
6. RAID controller command queue
7. RAID controller cache and caching algorithm
8. I/O performance for the controller and I/O performance of the disk drives
and the disk cache
This is not a straightforward process; hence, to run a benchmark correctly,
the vendor will incur a high cost. You must carefully examine your requirements
before you set up a benchmark and use these requirements to develop the specification.
Benchmarking Rules
Development of benchmarking rules is just plain hard work. Remember that the
goal of every benchmarker is to win the benchmark. In some cases, a good portion
of the benchmarker's salary is contingent upon the results of the benchmark.
Therefore, experienced benchmarkers will read your rules and look for advantages
and loopholes. They may follow the rules to the letter but not necessarily follow
the spirit. Some slimy benchmarking tricks include creating file systems with
allocation sizes equal to the file sizes in the benchmark to optimize allocation.
Other tricks are equally unethical, such as placing the file system in
memory or booting from an SSD to speed UNIX pipes for an interactive benchmark.
In such cases, the rules are followed, but the customers do not get what they
wanted -- they merely get what they asked for.
My point in using these examples is to show that documenting what you really
want and developing detailed rules are critical to getting an accurate benchmark.
As a reformed benchmarker, I now spend time writing benchmarking rules. I am
always amazed at how creatively benchmarkers read these rules. For a recent
benchmark for a government organization, a detailed set of rules for running
the benchmarks totaled more than 100 pages. Obviously, this is very time-consuming
and is not very cost effective unless you have regular large procurements. The
government organization in question has a yearly benchmark process to purchase
around $40 million in new hardware.
The benchmarking that you will be involved in will very likely include databases.
Databases present many additional problems in regard to developing benchmark
specifications. How and where the tables, indexes, redo logs, and even the database
application reside become critical issues. The single most important requirement
of any benchmark or test is that it match reality. Does the benchmark match the
reality of your operational environment? Operational issues like backup and
recovery must be considered within the design of a benchmark. For example, if
you are purchasing storage, the vendor must benchmark the database on the server
that you use.
Benchmarking Evaluation Process
The best way to ensure that the process is fair is to attend the final benchmark
run and witness your operational environment in action. That way, you can
certify the results and ensure the vendor does not try something that you missed
in the rules. Most of the time, visiting each vendor is impossible given the
costs and the time. Often, you simply provide the instructions to the vendor,
wait for the vendor presentation of the results, then decide on a winner based
on the decision criteria.
Ah, what about the criteria? The decision or evaluation criteria are absolutely
critical to this process. The benchmark team must agree on the evaluation criteria
before the benchmark is released. Everyone has a favorite vendor -- from
accounting to the system manager to the operations manager to you. Agreeing
upon written evaluation criteria is the only way to be fair to the vendors and
eliminate fighting within the organization after the results are delivered.
I am not suggesting that the vendors be given the evaluation criteria, but I
strongly recommend that you have internal agreement on the evaluation criteria.
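One way to put agreed criteria in writing before any results arrive is a weighted scoring sheet. The sketch below is only an illustration: the criteria names, the weights, the 0-10 scale, and the vendor scores are all hypothetical values that your own team would replace.

```python
def weighted_score(scores, weights):
    """Combine per-criterion scores (0-10) using weights that sum to 1."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(scores[name] * weights[name] for name in weights)


# Hypothetical weights, agreed internally before the benchmark is released.
WEIGHTS = {"performance": 0.4, "reliability": 0.4, "price": 0.2}

# Hypothetical scores, filled in after the vendors report their results.
VENDOR_SCORES = {
    "vendor_a": {"performance": 9, "reliability": 6, "price": 5},
    "vendor_b": {"performance": 7, "reliability": 8, "price": 7},
}

# Rank vendors from highest to lowest weighted score.
ranking = sorted(
    VENDOR_SCORES,
    key=lambda vendor: weighted_score(VENDOR_SCORES[vendor], WEIGHTS),
    reverse=True,
)
print(ranking)  # ['vendor_b', 'vendor_a']
```

Because the weights are fixed before the results come in, nobody can quietly re-weight the sheet afterward to favor a preferred vendor.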
So, what should be the criteria for evaluation? Back in the mid-90s there
was a saying about RAID purchases. You could have it Fast, you could have it Reliable,
or you could have it Cheap -- pick any two. Although RAID is a bit different now,
these same criteria can apply to evaluation of any system or set of hardware
components. Determining your evaluation criteria is a balancing act unless you
have an unlimited budget. You cannot get 99.999% uptime (also known as five
9s, or about 5 minutes of downtime per year) for the same price as 99.9% (about
8 hours and 45 minutes of downtime per year) and have the same performance.
You cannot buy the system to support five 9s of uptime for even twice the price
of three 9s and have the same performance. There must be some tradeoffs in terms
of performance, reliability, and price, and you must determine which criteria
are most important for your specific environment.
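The downtime allowed by a given uptime percentage follows directly from the number of minutes in a year. Here is a minimal Python sketch; the function name is my own, and I assume an average year of 365.25 days.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # average year, including leap years


def downtime_minutes_per_year(availability_percent):
    """Return the yearly downtime in minutes allowed by an uptime percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100


for pct in (99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Each additional 9 cuts the downtime budget by a factor of ten, which is why the price climbs so steeply.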
Acceptance Planning
Development of an acceptance plan should also be part of the benchmarking
process. Vendors should be informed of the criteria for acceptance before they
submit their final proposals (Best and Final Offers, or BAFOs). The acceptance
plan should therefore be agreed upon by the internal interested parties beforehand. The acceptance
plan should cover integrating the hardware and/or software into your operational
environment and ensure the performance and reliability that the environment
requires.
Too often, the hardware arrives on the dock, is powered up, and the accounting
department cuts a check. This does not benefit either the vendor or the customer.
For the vendor, the acceptance criteria should ensure that the equipment is
delivered, works as expected, and is configured to meet the benchmark criteria.
For the customer, acceptance criteria should ensure that you get to see the
benchmark performance, see how to configure the system, understand the configuration
options, and therefore get some free training on the use of the system.
Having an acceptance process provides mutual benefit and should be part of
the procurement process. When acceptance criteria are clearly defined and an
acceptance process is required, the new hardware is more quickly and effectively
integrated into the environment. This reduces the real, total cost of the hardware
or software.
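As a rough illustration of a clearly defined acceptance criterion, the Python sketch below compares figures measured on the delivered system with the figures the vendor reported in the benchmark. The metric names, the sample numbers, and the 5% tolerance are hypothetical assumptions for the example; your acceptance plan would define its own.

```python
def acceptance_failures(benchmark, measured, tolerance=0.05):
    """Return the metrics where the delivered system falls short.

    Both arguments map a metric name to a higher-is-better value, for
    example throughput in MB/s. A metric fails when it is missing or more
    than `tolerance` (a fraction) below the benchmark figure.
    """
    failures = {}
    for name, promised in benchmark.items():
        actual = measured.get(name)
        if actual is None or actual < promised * (1 - tolerance):
            failures[name] = (promised, actual)
    return failures


# Hypothetical figures: what the vendor reported versus what was measured on site.
benchmark = {"sequential_read_mb_s": 180.0, "sequential_write_mb_s": 150.0}
measured = {"sequential_read_mb_s": 176.0, "sequential_write_mb_s": 131.0}

print(acceptance_failures(benchmark, measured))
# {'sequential_write_mb_s': (150.0, 131.0)}
```

An empty result means every benchmarked figure was reproduced within tolerance on your floor, which is a reasonable point at which to let accounting cut the check.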
Conclusions
Developing and following a formal process for acquisition, evaluation, and
acceptance of new hardware provides significant benefit to both vendors and
customers. Often you may just buy a single RAID controller or single HBA to
enhance the system you already have, but that should not preclude you from thinking
about newer models or other vendors. Given how quickly technology changes, buying
older equipment just because it is known to interoperate with what you have may
not make sense; you may realize greater price/performance gains with newer,
more advanced components.
In my next column, I will begin covering RAID. This is a complex topic that
involves many different areas. My goal is to demystify RAID.
Henry Newman has worked in the IT industry for more than 20 years. Originally
at Cray Research and now with a consulting organization, he has provided expertise
in systems architecture and performance analysis to customers in government,
scientific research, and industry around the world. His focus is high-performance
computing, storage and networking for UNIX systems, and he previously authored
a monthly column about storage for Server/Workstation Expert magazine.
He may be reached at: hsn@hsnewman.com.