% fortune -ae paul murphy

What mainframe Linux costs

Here's the opening bit from a widely quoted Network World report by "Layer 8" on IBM's recently announced plan to save $250 million by replacing 4,000 small servers with 30 mainframes:

Talk about eating your own dog food. IBM today will announce it is consolidating nearly 4,000 small computer servers in six locations onto about 30 refrigerator-sized mainframes running Linux saving $250 million in the process.

A related article by John Fontana gives more, although somewhat different, details:

The company will deploy 30 System z9 mainframes running Linux within six data centers to replace 3,900 servers, which will be recycled by IBM Global Asset Recovery Services.

The data centers are located in Poughkeepsie, N.Y.; Southbury, Conn.; Boulder, Colo.; Portsmouth, UK; Osaka, Japan; and Sydney, Australia.

The company is focused mainly on moving workloads generated by WebSphere, SAP and DB2, but will also shift some of its Lotus Notes infrastructure.

The mainframe's z/VM virtualization technology will play a big role in dividing up resources, including processing cycles, networking, storage and memory. With z/VM 5.3, IBM can host hundreds of instances of Linux on a single processor. The z9's Hipersockets technology, a sort of virtual Ethernet, will support communication between virtual servers on a single mainframe. IBM also will take advantage of logical partitioning, which is rated at Level 5, the highest security ranking on the Common Criteria's Evaluation Assurance Level (EAL).

IBM says energy costs represent the bulk of $250 million in expected savings over five years.

Now, before we look at substantive issues here, it's important to note that $250 million divided by 3,900 is $64,102 per server - meaning that this story embeds the prediction that energy costs will go up by a factor of more than seven sometime later this year.
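
Here's that arithmetic spelled out. The roughly $1,800 annual power and cooling figure per small server is my own assumption for illustration - neither article gives a baseline:

    # Back-of-envelope check on the claimed savings.
    # The ~$1,800/year baseline energy cost per small server is an assumption
    # for illustration; it is not a figure from either article.
    total_savings = 250_000_000   # claimed savings over five years
    servers = 3_900               # small servers being replaced
    years = 5

    per_server_total = total_savings / servers     # ~$64,102 per server
    per_server_yearly = per_server_total / years   # ~$12,820 per server per year

    assumed_energy_cost = 1_800   # assumed annual power/cooling cost per small server
    implied_multiple = per_server_yearly / assumed_energy_cost

    print(f"Implied savings per server: ${per_server_total:,.0f}")
    print(f"Per server, per year:       ${per_server_yearly:,.0f}")
    print(f"Multiple of the assumed ${assumed_energy_cost:,}/yr energy bill: {implied_multiple:.1f}x")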

Note too that a Dell 860 server with a dual-core Pentium D at 3.2GHz, an 80GB disk, and 2GB of memory lists for $1,060, while a 26MIPS z9 (about 120 x86 MHz) with seven attached 1.65GHz POWER5 Linux processors, each with 16GB, lists at about $850,000 with SuSE 9.0. The 30 z9s mentioned will therefore cost about $25.5 million against roughly $4.1 million for 3,900 new Dells - a premium of more than $21 million, or just over six times the price - and that's before additional IBM licensing and storage.
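
Using only the list prices quoted above, the capital cost comparison looks like this:

    # Capital cost comparison using only the list prices cited above.
    dell_price = 1_060     # Dell 860: dual-core Pentium D, 80GB disk, 2GB RAM
    z9_price = 850_000     # z9 with seven 1.65GHz IFLs, 16GB each, SuSE 9.0

    dell_total = 3_900 * dell_price   # ~$4.1 million
    z9_total = 30 * z9_price          # ~$25.5 million

    print(f"3,900 Dells: ${dell_total:,}")
    print(f"30 z9s:      ${z9_total:,}")
    print(f"Premium:     ${z9_total - dell_total:,} ({z9_total / dell_total:.1f}x the Dell cost)")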

This story, in other words, is as obviously bogus as the two mentioned yesterday.

What's important here, again as noted yesterday, is that the consolidation idea is driven by the use of system utilization as a measure of management success - and that produces an effect directly opposite to what users want. Users want services on demand, and the higher the system utilization, the lower the probability that the resources they want will be available when they want them.
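
To see why, consider the standard M/M/1 queueing result: mean response time is the service time divided by (1 - utilization). This is my illustration of the general point, not a calculation from the announcement, and the 0.5-second service time is an arbitrary example value:

    # Illustration of why chasing high utilization hurts responsiveness:
    # in an M/M/1 queue the mean response time is S / (1 - rho), where S is
    # the service time and rho is utilization. The 0.5-second service time
    # is an arbitrary example, not a figure from the article.
    service_time = 0.5   # seconds per request (assumed)

    for utilization in (0.10, 0.50, 0.80, 0.90, 0.95, 0.99):
        response = service_time / (1 - utilization)
        print(f"utilization {utilization:4.0%}: mean response {response:5.1f} s")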

We don't have configuration details, but if we imagine that the 3,900 servers are all low-end, 3GHz, x86 uni-processors while the z9s all have the maximum seven Linux "engines", we'd read this announcement as promising to replace 11,700 x86 GHz with 346 PPC GHz - a 97% reduction in available machine cycles for a workload described as consisting mainly of known resource hogs: WebSphere, Notes, SAP, and DB2.
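
Spelling that out - with the 3GHz uni-processor configuration treated, as noted, as an assumption rather than anything IBM has published:

    # Aggregate cycle comparison under the assumptions above:
    # 3,900 single-core 3GHz x86 boxes versus 30 z9s with seven 1.65GHz IFLs each.
    x86_ghz = 3_900 * 3.0      # 11,700 x86 GHz retired
    ppc_ghz = 30 * 7 * 1.65    # ~346 PPC GHz added

    reduction = 1 - ppc_ghz / x86_ghz
    print(f"x86 GHz retired:   {x86_ghz:,.0f}")
    print(f"PPC GHz added:     {ppc_ghz:,.1f}")
    print(f"Cut in raw cycles: {reduction:.0%}")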

We do not know what the service times or request frequencies for these look like, but we can make assumptions about this, do the arithmetic, and then see what that tells us about how well this is likely to work.

The seven IFLs attached to each z9 offer a total of 11.55 PPC GHz - at the usual 2:1 PPC advantage, this is roughly equivalent to 23 x86 GHz. The 130 single-core x86 servers at 3GHz that each z9 replaces (3,900 spread across 30 machines) offer 390 x86 GHz. In other words, if mainframe ghosting imposed zero overhead and everything ran from directly connected RAM disks, then the maximum average x86 utilization the z9 could handle would be just under 6%.
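
The same calculation in code, with the 2:1 PPC-to-x86 conversion treated as a rule of thumb rather than a measured figure:

    # Best-case average x86 utilization one z9 can absorb, per the reasoning above.
    # The 2:1 PPC-to-x86 conversion is the rule of thumb assumed in the text.
    ifl_ghz = 7 * 1.65               # 11.55 PPC GHz per z9
    x86_equiv_ghz = ifl_ghz * 2.0    # ~23 x86-equivalent GHz
    replaced_x86_ghz = 130 * 3.0     # 390 GHz across the 130 servers each z9 replaces

    max_utilization = x86_equiv_ghz / replaced_x86_ghz
    print(f"Ceiling on average x86 utilization per z9: {max_utilization:.1%}")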

Notice, however, that this is not a linear function - the 6% noted above is a local maximum for a function determined by the relative task completion potential of the two processors and the workload thrown at them. If, for example, the x86 workload consisted of one request per server per minute and that request took an average of 0.6 seconds to service, you'd have a 1% x86 utilization rate - but the mainframe would need 62 seconds per minute to keep up with the workload.

In practice, of course, overheads are a killer because there isn't enough memory or network bandwidth available to keep all 130 ghosts alive concurrently.

The bottom line is simple: to the average user, going from a system with 11,700 x86 GHz to one with 346 PPC GHz will be almost exactly like going from a 3,000MHz x86 machine to one running at 180MHz. What that means is that not only will the data center never get close to breaking even on capital cost, but the organization as a whole will incur significant losses, because users expect better and will adapt - usually by some combination of drunken-sailor spending on Wintel to bypass data center delays and/or by slowing whatever they do to match the mainframe's pace.
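
For what it's worth, that per-user equivalence is just the same ratio applied to a single desktop clock - again assuming the 2:1 PPC conversion and zero virtualization overhead, it comes out within a few MHz of the 180MHz figure above:

    # The same ratio expressed as an "effective clock" for a single user,
    # again assuming the 2:1 PPC conversion and zero virtualization overhead.
    x86_ghz = 11_700          # total x86 GHz being retired
    ppc_ghz = 30 * 7 * 1.65   # ~346 PPC GHz replacing them
    x86_equivalent = ppc_ghz * 2.0

    effective_mhz = 3_000 * x86_equivalent / x86_ghz
    print(f"A 3,000MHz desktop's share shrinks to roughly {effective_mhz:.0f}MHz")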


Paul Murphy wrote and published The Unix Guide to Defenestration. Murphy is a 25-year veteran of the I.T. consulting industry, specializing in Unix and Unix-related management issues.