% fortune -ae paul murphy

Storage choices

In yesterday's blog I noted a test done for a Sun customer comparing 24 Sun SPARC Enterprise T5220s against more than 250 Dell 2950s:

Twenty-four Sun SPARC Enterprise T5220 servers (UltraSPARC T2 processors) were able to consolidate a customer's workload of MySQL databases which were running on over 250 Dell 2950 servers. For this medium-weight OLTP customer workload, the solution was architected to handle query distribution at the application layer. In each of the over 700 instances of MySQL, up to a 4 GB database was used.

Doing it the customer's way - keeping the older Java run-times and the 700 small DB instances - makes perfect sense in the context of Sun's need to make the sale and the customer's interest in a comparison based only on changing the hardware. It's not, however, where the effort should have stopped.

Specifically, this test was done before Sun's new storage gear became available, without the benefit of the CMT optimized "web2 stack", and without consolidating those 700 DB instances.

We don't know on what basis the data was divided into those 700 instances, but there are basically four options:

  1. sharding based on index ranges with, and without, duplication; and,
  2. sharding based on table groups (schemas) with, and without, duplication.

As a first step this can be reduced to two options simply by eliminating any duplication; the second step would then be to enlarge either the index ranges or the table groups, as the case may be, to cut the number of database schemas down to something much smaller - as sketched below.
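
To make that second step concrete, here's a minimal sketch of what the application-layer routing could look like for the range-based case once duplication is gone. Everything in it - host names, key ranges, the instance count - is made up for illustration:

    import bisect

    # Hypothetical consolidated layout: the key space that was spread across
    # hundreds of narrow shards, now split across just four instances.
    NEW_SHARDS = [
        (0,         "t5440-a"),   # keys 0 .. 1,749,999
        (1_750_000, "t5440-b"),
        (3_500_000, "t5440-c"),
        (5_250_000, "t5440-d"),
    ]

    def route(key, shards=NEW_SHARDS):
        """Return the instance owning 'key': the last shard whose lower bound <= key."""
        lowers = [lower for lower, _ in shards]
        return shards[bisect.bisect_right(lowers, key) - 1][1]

    print(route(4_200_000))   # -> t5440-c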

What the right number is depends jointly on workload and on the hardware approach selected. On the hardware side there are two main options:

  1. use four-processor T5440 gear with locally dedicated JBODs; or,
  2. use two-processor T5440 gear with second generation "thumper" storage like the Sun 74XX series.

Both approaches rely on ZFS to manage the file space and both require only minor changes to the application's query distribution logic. The real difference is that the 74XX storage servers are shared devices, enabling database consolidation all the way down to one instance if that makes sense for the application, whereas the JBOD approach can require one shard per processor and, presumably, extra 10GbE cabling between the processors to handle multi-point queries.
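
The practical consequence shows up when a query touches more than one key range. A rough sketch of the difference, with made-up host names and a stand-in for the real MySQL client call:

    SHARD_HOSTS = ["t5440-1", "t5440-2", "t5440-3", "t5440-4"]   # hypothetical

    def run_on(host, sql):
        """Stand-in for an actual MySQL client call against 'host'."""
        print("sending to %s: %s" % (host, sql))
        return []

    def multi_point_query(sql, shared_storage):
        if shared_storage:
            # 74XX-style shared storage: the data can live in one consolidated
            # instance, so even a multi-range query hits a single server.
            return run_on(SHARD_HOSTS[0], sql)
        # JBOD-per-shard: scatter the query to every shard over the network,
        # then gather and merge the pieces in the application layer.
        rows = []
        for host in SHARD_HOSTS:
            rows.extend(run_on(host, sql))
        return rows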

On the software side the constraint is the "vertical" scaling limit within MySQL's InnoDB component. Thus "Mr. Benchmark's" MySQL scaling tests on a T5440 show per-instance throughput peaking at about 735 TPS (roughly 4,350 SQL statements per second), and you'd want to limit your consolidation so that each instance stays at no more than about 60% of that saturation level during 98% or more of processing time.
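
Turning that rule into a back-of-the-envelope instance count is simple arithmetic. The 735 TPS ceiling comes from the benchmark cited above; the aggregate workload number below is a placeholder you'd replace with a measured figure:

    import math

    PEAK_TPS_PER_INSTANCE = 735          # per-instance InnoDB ceiling on the T5440
    HEADROOM = 0.60                      # stay at or below 60% of saturation
    sustained_tps = PEAK_TPS_PER_INSTANCE * HEADROOM   # about 441 TPS per instance

    aggregate_workload_tps = 3_000       # placeholder: measure the real workload

    instances_needed = math.ceil(aggregate_workload_tps / sustained_tps)
    print(instances_needed)              # -> 7 for this hypothetical load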

Once you pick the hardware architecture and the number of instances you can choose the right amount of RAM and flash memory to minimize read/write latencies for the application. As far as I know there's no right way to calculate how much ZFS cache you need short of trial and error, but on the "bright side" this is made easier by the fact that the flash comes in minimal 18GB increments.
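
About all you can do up front, then, is round whatever working-set estimate you arrive at up to the module size - a trivial sketch, with the working-set figure purely illustrative:

    import math

    FLASH_MODULE_GB = 18          # minimum purchase increment
    working_set_gb = 250          # placeholder: refine by trial and error

    modules = math.ceil(working_set_gb / FLASH_MODULE_GB)
    print(modules, modules * FLASH_MODULE_GB)   # -> 14 modules, 252 GB of flash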

The point of this, of course, is simplification: going from 250 Dells to 24 T5220s is a big step in that direction, but going from 700 DB instances to perhaps as few as four, running on two 7420 storage servers and eight T5440s, is really a much bigger one, both in terms of the administrative resources you need to commit to normal operations and in terms of risk containment.


Paul Murphy wrote and published The Unix Guide to Defenestration. Murphy is a 25-year veteran of the I.T. consulting industry, specializing in Unix and Unix-related management issues.