NOTE: this is a draft of a forthcoming Linuxworld.com article. Please do not republish without permission. Comments, of course, are welcome.

Sun shine, Sun bright
This wish I make tonight

Sun made a tiny profit on operations during the quarter ending Dec 29, 2002 but posted its

Sunny side up? or over heavily?

Open Magazine recently ran a story by editor Nancy Cohen under the headline "SUN SWATS," which she summarized as:

Left to interpret the order of the universe, IBM would have you believe any IT decision maker with the least bit of common sense would choose to migrate from Solaris to Linux any day: And they have the SWAT team to do it.

That article got me thinking about Sun's future and the strategies needed for the company to survive and grow over the next few years.

It's not obvious that Sun will survive the next few years. Today's market cap for the company, about $9.3 billion based on the December 19 closing price of $3.00 per share, is up about a fifth from several weeks ago but still rather less than the liquidation value of the company for someone willing to default on a number of outstanding contractual commitments. Microsoft, IBM, and HP can't make a takeover bid for antitrust reasons, but all of those companies know people who know people - including corporate raiders whose arm's-length activities could benefit both parties.

Supposing this threat can be kept at bay through the legal system, the question is: can Sun continue to make it on its own?

IBM SWAT teams aren't a problem. This is normal competitive behavior, not unusual in the industry, aimed mainly at influencing HP-UX and VMS users looking for a new home. Neither are dishonest TCO studies that do things like compare the cost of a five-year-old Sun box running Oracle to a new Dell machine running MySQL. These things only work as reinforcers for prior beliefs because those who want to believe will avoid the half second of thought that it takes to realize how specious and self-serving the comparisons are.

Both kinds of things do, however, represent attempts by IBM and others to take advantage of weaknesses in Sun's market position and it is those weaknesses, not competitor efforts to take advantage of them, that Sun has to address.

What I see as Sun's three biggest weaknesses are:

  1. Sun's top management seems to have lost both focus and control. The original sense of mission has been displaced by the pressure to meet quarterly earnings forecasts, while middle and sales management seem to be developing the combination of opportunism and arrogance that helped doom Digital.

  2. Most of Sun's customers don't have any idea how to use Unix effectively - and Sun isn't helping them figure it out. This became a problem as soon as the first SPARC Servers moved out of the hands of engineers and academics and into business roles. That was crucial to the company's growth, but it put Unix into the hands of people who simply didn't have the background to use it properly and whose attempts to treat Unix as just a cheaper mainframe or mini drove up costs and complexity while driving down performance. (See my Unix Guide to Defenestration for an extended discussion and remediation strategies.)

    This problem is now becoming a crisis as people who bought large Sun configurations in '98 and '99 scuttle back to IBM while bad-mouthing Sun and Unix as the cause of their failure to make those systems work effectively.

  3. It's reasonably clear where both Solaris and SPARC are headed: to Plan 9 and the power to support it. What's missing is any sense of a Sun commercial vision on how that's going to be migrated from the lab to the user's working environment. Right now Sun's hardware and much of their software look out of step with the market and, absent some compelling reasons to the contrary, that disconnect is going to lead to a significant sales drop next year. Those reasons exist, but Sun isn't telling anyone - and competitor noise like phony TCO studies or inapplicable benchmarks combines unhappily with the confused N1/Blade server message Sun is currently putting out, significantly raising the barriers Sun will have to overcome when it does figure out what it wants to say.

These three issues are, of course, very closely related and may well just be three expressions of the same problem: management's substitution of short term "bottom line" thinking for long term strategic focus.

                               Sun 6800       Sun V880       Dell 6600
Number of Units                1              4              8
Total RAM                      128GB          128GB          128GB
Total Disk                     2 x 5TB        4 x 2.5TB      8 x 1.25TB
Total CPUs                     24             32             32
Total GHz                      252            336            640
Operating System               Solaris        Solaris        Linux
Approximate list price         $1.8 Million   $920,000       $432,000
Approximate negotiated price   $1.3 Million   $760,000       $432,000

Consider, for example, these three systems configurations and ask yourself what the extra million or so for the Sun 6800 buys.
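The arithmetic behind that "extra million or so" can be sketched as follows. This is a minimal illustration that uses only the negotiated prices and CPU counts from the table above; the dictionary layout and the function names are hypothetical, chosen just for this example.

```python
# Negotiated prices (USD) and total CPU counts, copied from the table above.
CONFIGS = {
    "Sun 6800": {"price": 1_300_000, "cpus": 24},
    "Sun V880": {"price": 760_000, "cpus": 32},
    "Dell 6600": {"price": 432_000, "cpus": 32},
}


def premium_over(base: str, other: str) -> int:
    """Extra dollars paid for `other` relative to `base` (hypothetical helper)."""
    return CONFIGS[other]["price"] - CONFIGS[base]["price"]


def price_per_cpu(name: str) -> float:
    """Negotiated price divided by the number of CPUs in the configuration."""
    cfg = CONFIGS[name]
    return cfg["price"] / cfg["cpus"]


if __name__ == "__main__":
    for name in CONFIGS:
        print(f"{name:10s} ${price_per_cpu(name):>10,.0f} per CPU")
    gap = premium_over("Dell 6600", "Sun 6800")
    print(f"6800 premium over the Dell configuration: ${gap:,}")
```

At negotiated prices the gap between the 6800 and the eight Dell units is $868,000; at list prices ($1.8 million against $432,000) it is about $1.37 million.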

Now, obviously, if your thought is that it buys the ability to partition - so you can run the 6800 as if it were three V880s - you'd be illustrating point two, above, and what that should buy you is a pink slip. Partitioning made sense on the System 370, but it's absurd as a Unix management practice and Sun should be telling people this instead of bowing to demand for it - thereby illustrating, albeit very crudely, points one and three above.

If you don't think this looks like a difficult decision, I have a challenge for you: go to Sun's web site and see if you can find something there that justifies spending the extra million bucks. I don't think you'll find anything convincing - and that fact demonstrates everything that's going wrong with the company.

That million bucks really does buy you something important and valuable: the ability to treat a very large amount of memory as a single, symmetrically accessible data store, and an operating system capable of managing it with near perfect reliability for very long periods of time. There is a one wait state penalty on memory accesses outside the local processor's 4GB base address space, but a 6800 maintains cache coherency across all 24 CPUs and does so at about three times the comparable throughput rate of the maxed-out, 16 CPU, five-million-dollar z900 mainframe. Getting that to work on the 6800 and its bigger cousins, the Starfires, is an unbelievable technical feat that's well ahead of the competition.

Unfortunately for Sun it is also well beyond the understood needs of most of its customers. What's most striking about those Dell 6600 units isn't their 32-bit memory limitation or the use of Ultra320 SCSI to achieve higher throughput per channel than a zSeries mainframe; it's that progress in building cheap, fast Intel boxes has caught up with most business requirements. That happened partially through the operation of Moore's Law, with machines getting faster and cheaper, and partially because the software industry has spent ten years learning how to break work up into pieces small enough to work in Wintel client-server land. Regardless of how it happened, the bottom line is simple: those little machines have grown up to meet problem scale and are now capable enough for use as the building blocks in a corporate systems infrastructure.

Sun's machines, meanwhile, have grown off-scale relative to most business needs. There are jobs they do better than anything else on the market - think large scale business intelligence processing or very fast problem solving (see my Safetyjet International example). Unfortunately for Sun those markets are both relatively small and dominated by OSF/1 on Alpha users who are far more likely to go to IBM than Sun.

There's a much larger and more significant market for high speed research computing, but SPARC has both a real and a self-inflicted problem there:

  1. The real problem is that SPARC defaults, as do other RISC CPUs, to the IEEE 754 64-bit floating point specification and therefore doesn't get the right answers in cases requiring extensive iterative computation. If, for example, you get it to sum the square roots of the first billion positive integers using the GCC compiler and default libraries, the result is off by 0.18.

    Although completely meaningless in business processing, that error is devastating if you're trying to simulate the three body problem over a long period. You can get around this by using extended precision libraries, but that cuts performance significantly.

  2. The self-inflicted version of this problem is that SPARC has an answer to floating point performance problems in its SIMD instruction set but doesn't go out of its way to tell anyone about it and hasn't given the research community the support needed to motivate use of it.

    The main problem is that researchers tend to develop for the box on their desk - and these days that's more likely to be an Intel box running FreeBSD or Linux than a SPARC. As a result, when they do run that code on a SPARC there's nothing there to take advantage of the short array capabilities provided in the hardware, and they get neither the performance nor the accuracy the machine is capable of.
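The accumulation effect described in point 1 can be observed on any machine that uses IEEE 754 doubles. The sketch below is an illustration only, not the GCC test referred to above: it is written in Python, is meant to be run with a much smaller N so that it finishes quickly, and uses `math.fsum` (which returns a correctly rounded sum) as the reference. The function names are hypothetical, Kahan summation is swapped in here as a simple stand-in for the extended precision libraries mentioned above, and the size of the drift will vary with N.

```python
import math


def naive_sqrt_sum(n: int) -> float:
    """Sum sqrt(1) .. sqrt(n) with a plain running total in 64-bit floats."""
    total = 0.0
    for k in range(1, n + 1):
        total += math.sqrt(k)  # each addition rounds to the nearest double
    return total


def reference_sqrt_sum(n: int) -> float:
    """Same terms, summed by math.fsum, which avoids intermediate rounding."""
    return math.fsum(math.sqrt(k) for k in range(1, n + 1))


def kahan_sqrt_sum(n: int) -> float:
    """Compensated (Kahan) summation: carries the rounding error forward."""
    total = 0.0
    comp = 0.0  # running estimate of the low-order bits lost so far
    for k in range(1, n + 1):
        y = math.sqrt(k) - comp
        t = total + y
        comp = (t - total) - y  # what was lost when y was added to total
        total = t
    return total


if __name__ == "__main__":
    n = 1_000_000  # assumption: far smaller than the billion terms cited above
    ref = reference_sqrt_sum(n)
    print("naive drift:", naive_sqrt_sum(n) - ref)
    print("kahan drift:", kahan_sqrt_sum(n) - ref)
```

Comparing the two drift figures for growing N shows how rounding error accumulates in the plain running total. The compensated version recovers most of the lost accuracy at the price of several extra floating point operations per term, a small-scale version of the accuracy-for-performance trade described in point 1.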

So does this mean Sun is doomed? No. Sun is out of sync with the market - way ahead on large computational systems and behind on delivered, rather than possible, floating point performance. Management has to act to correct that. If they do, then Sun stands a fair chance of becoming the world's most commercially successful computer systems company, but if they choose to do a Digital or HP style self-immolation instead, well, how bad do you think a few years of IBM/Microsoft duopoly would be for the country and the industry?

In the longer run Sun's current technical directions with Solaris and SPARC will give us the ability to treat a network of machines stretched across offices around the world as one machine, running applications anywhere at any time without being concerned about resource management or security - but Sun has to survive to get there and, more importantly, educate the market to understand and use these new abilities.

In the long run that means rebuilding the techno-culture at Sun: deciding where to focus and doing that; getting professional services under control; building a smarter desktop -- leading the market instead of reacting to it.

Unfortunately those are longer term things. What Sun needs now is to make it through the next year more or less intact so that it can get to the longer term stuff. To do that without a lot of bench depth at Sun, Mr. McNealy is either going to have to take a lot of losses and the layoffs that go with them, or land a Hail Mary pass.

There are some opportunities for those. Two I particularly like are:

  1. Apple has a real problem right now. Its relationship with Motorola has been under pressure for years and they seem about to drift into IBM's embrace with a deal on using the Power4 CPU in the next generation Macs. The Power4 is a nice machine, but the deal will give IBM far more control than it should have and position it for a quiet takeover of Apple's executive suite.

    Sun does not need to buy Apple to head that off; all it has to do is bite the bullet and do whatever it takes to get Apple as a partner on Solaris/SPARC. Technically that's a no-brainer: the MacOS X shell already runs on SPARC, and Solaris can as easily run a PowerPC board inside the box for full backward compatibility as it can run an Intel board now. Strategically a deal like this leaves IBM dependent on Microsoft and Intel for the desktop, adds Motorola to the SPARC architecture team, and gives both Apple and Sun a wonderfully coherent desktop-to-server story to tell their customers.

    This should have been done two years ago; but it's still not too late - in the short term because this may be Apple's best route to continued independence and in the long term because it indirectly brings Motorola's enormous strength in handhelds into the combined Solaris/MacOS X community.

  2. The current Linux adventure is a mistake. Cobalt looked like a good hedge, but management left and now Sun people have to manage a product line that doesn't fit their way of thinking. "All the wood behind one arrow" isn't a slogan that puts $2000 Lintel boxes in the same sales kit with million dollar servers.

    There's a better than good answer available: Gateway.

    Gateway has an established name, people who know how to sell $1,000 boxes, and a desperate need for cash. Sun has lots of cash, and a need to put some of it out of reach of a raider. So create a tracking stock for a billion dollar loan to Gateway - provided they take over managing Cobalt and become the company VA Linux set out to be.

    Turning the fourth largest PC company into an all Unix production will provide options for the anti-Microsoft crowd (no Windows licenses, no exposure to Microsoft audits, big-time commercial support for open source, etc.) while quite probably giving AMD the market base needed to repeat its performance when Intel introduced the Pentium Pro - but this time against Itanium, SPARC's most important longer term competitor.

There are other, less flamboyant, possibilities too; but in the longer run what counts is management strength and focus. That's the real challenge for Sun: they need to refresh the vision, re-invigorate management, get all that wood behind one arrow again, educate their markets to the value of their products, and, in the words of an executive who has successfully resurrected his technology company, once again make their company "insanely great."