% fortune -ae paul murphy

"We're working on it"

Here's a bit from bportlock's response, last week, to my view that mainframe and Wintel-style server consolidation is better done without OS virtualization, by shifting some services to Unix and running the remaining applications two or more to a server:

No No No!!!!

Businesses don't make these decisions on technical issues. They make a decision based on market or financial issues and then tell the techies to do it.

What they WON'T stomach is the techies coming back with a plan to cause massive disruption by changing applications just so they can save money on servers or play with the latest "technical toys"

Re-architecting, reimplementing, retraining, disruption costs, etc etc will probably cost more than the savings on the Sun method. OTOH, virtualisation gives the tech dept a way to say "we can cut costs and cause NO disruption".

How many microseconds do you think it will take to make the business decision - in favour of virtualisation?

What he's articulating here is probably the most common argument larger data centers raise against switching anything to Linux: that the status quo has to continue because doing anything else puts the data center's ability to serve its users at risk.

The problem with this argument is that it's both a self-fulfilling prophecy and wrong - wrong because this is an alligators-and-swamps thing, and self-fulfilling because letting the people responsible for the status quo block most change, while managing anything that does get approved, pretty much guarantees continuing failure.

On the swamps side, the reality is that most larger data centers exist on the precarious edge of continuous, widespread failure - every installation I know of with more than a few hundred PCs and servers has something important to someone down every single day, and the larger the operation, the more predictably management lives in a continual state of functional and budget crisis.

In that situation small technology changes look like (and invoke) big risks, and the use of crises to extort budget and staffing increases from some management committee makes it almost impossible to get funding for strategic change. The result is a situation in which a new worm can force a large data center to request, and spend, millions on emergency Windows upgrades, but a strategically focused CIO cannot set aside sixty to eighty thousand dollars to prove the effectiveness of a change to non-Wintel hardware and staffing.

In that context bportlock is quite right - the worse the situation is, the more likely it is that people from IT and user management will convince each other that continuing cost escalation in a losing battle offers lower risk than attempting to drain the swamp.

Basically, the more active the alligators, the more impediments management committees tend to throw in front of efforts to get rid of, rather than patch over, the problems - and generally the more willing they become to throw good money after bad.

In finance this is called the sunk cost fallacy, and it's equally common, but no more logical, in IT.

So what can you do? If you're an IT manager trapped between escalating costs and failures, server-style virtualization can help you be seen responding in politically correct ways: reducing your server counts, and thus space and energy use, at the cost of decreased user services - with the bonus that the worse your network and storage implementation is, the less effect users will suffer from increasing server CPU commitment.

Notice, too, that if you can push x86 server utilization from a few percent to something above about 35-40% without users complaining about response time, the real lesson isn't just that your machines were under-utilized - it's that at least one of your network or storage operations is a shambles.
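To put some rough numbers behind that, here's a minimal back-of-the-envelope sketch in Python. The server count and utilization figures are illustrative assumptions, not measurements from any real installation; the point is simply that machines idling at a few percent pack onto far fewer hosts, and that if nobody notices the change, CPU was never the constraint.

    import math

    # Illustrative assumptions only - not measurements from any real site.
    servers = 50              # existing x86 boxes
    avg_cpu_util = 0.05       # "a few percent" average standalone utilization
    target_host_util = 0.40   # the roughly 35-40% ceiling discussed above

    # Total CPU demand, expressed in "fully busy server" equivalents.
    total_demand = servers * avg_cpu_util                      # 2.5 equivalents

    # Hosts needed if CPU were the only constraint.
    hosts_needed = math.ceil(total_demand / target_host_util)  # 7 hosts

    print(f"{servers} servers -> {hosts_needed} virtualization hosts "
          f"({servers / hosts_needed:.1f}:1 consolidation)")
    print("If response times don't change at that loading, CPU was never "
          "the bottleneck - look at the network and storage instead.")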

But there's a hidden problem: once you've done virtualization, where do you go? You've increased complexity, increased per-server risk, increased your people-side risk, and decreased both user services and system flexibility - all to get into a box with no way out except replicating what you had before.

The better answer is to reduce complexity and risk while improving performance: look at the whole picture in terms of services delivered at the user desktop, clean up networking and storage, migrate some services to Linux or Solaris, and consolidate your remaining Wintel applications onto a few over-powered machines with carefully thought-through, properly provisioned network and storage access.

The trick, of course, is doing it without causing a panic among either your own staff or user management - and the right answer for that is well known: a skunk works established outside your data center lets you hire non-Wintel staff, experiment with Unix software, and get everything ready for production use. Then, when you switch users to the new gear, they won't notice a thing - and because the improved performance and reliability you get from Unix means you never hear from them again, your swamp will gradually get smaller and smaller until it's obvious that most of the remaining alligators are on your own staff.


Paul Murphy wrote and published The Unix Guide to Defenestration. Murphy is a 25-year veteran of the I.T. consulting industry, specializing in Unix and Unix-related management issues.