Personal Disasters

Not all IT disasters, of course, are processing related - some are just personally embarrassing. For fun, therefore, I thought I'd purge two of my more moronic moments and invite you to present some of your favourites.

The one I can't not remember first came in the late eighties. What happened was that a senior partner in the Calgary office had asked for a Unix expert to help with some clients running a particular Oil and Gas royalties management package on NCR towers. After touring a half dozen client sites the partner involved was vaguely starting to think I might know something about the subject when we got to a guy who said his system had been running more and more slowly over the last couple of months and was now functionally unusable.

So as The expert, I was expected to wave a magic wand and fix this - and it turned out to be blindingly obvious too: hundreds of zombies left by a shell pipeline that was aborting during the last stage. Easy, except nobody knew the root password, or even who had it. So I wrote a fix under a user account for their admin to put in, and ventured that the quickest way to get rid of the zombies was simply to reboot.

Great, except they didn't want to wait until the only guy who might have the password came back from holidays, so after some discussion I warned them of the risks, expressed my faith in NCR Unix, and pulled the plug.

Two long minutes later, it's up and running - I'm sweating a little less and... the zombies were still there!

Huh? If it doesn't work, do it again, right? So a second reboot later people are acting like I have leprosy because the zombies are STILL THERE.

A few minutes of red-faced hemming and hawing later the obvious clicks in: NCR towers had battery-backed memory, so they came up exactly where they went down. Flip the switch, reboot: no more zombies and normal performance again - but that partner never called me again and neither did the client.

And then there was the Vancouver episode... not the one in which I unsuspectingly told a secretary who knew my preference for FedEx to ship a 70-pound monitor back to the factory for repair ($700+, but overnight!) but the one in which an entire office system disappeared.

That office had some weird people - back in the early nineties an unhappy employee had disconnected and hidden the Ascend P50 connecting them to UUnet and therefore to me; and a few years later a temporary office manager had hired a PC guy to "reload" Windows NT on the SCO Application server - while 16 or so people were using it and had their work rerouted to Calgary via nothing more than a dual ISDN connection to UUnet at each end.

So when I got the call - nothing's working: no applications, no X-terminal start-up, no internet access, nothing - I was instantly suspicious. I did the obvious and connected to the router, by then an xDSL connection to the local metronet, to see what was what. It was fine, but reported the Sun 250 unreachable - and that corresponded with the user report that the X-terms weren't booting.

And yet there was traffic on the router, somebody surfing the net from the HP laptop we kept around because the office manager thought carrying it made his sales guy look more professional to clients.

I got a lady from Sun in Vancouver to go have a look. Her report? The 250 was fine; somebody had disconnected all the cables at the hub, and everybody came back on line as soon as she plugged them in again. Great, but I still couldn't see it on the router, they still lacked external access, and traffic on the thing had stopped.

I asked her to check its cabling ... except she couldn't find it; so pretty soon I was upset, she was upset, and half the office was looking for something that had to be flashing its pretty little lights right in front of their noses because I was damned-well logged into it.

By then somebody had called the guy who owns most of the company - and he was telling me he couldn't care less about why the office was off the net and I should get my ass on an airplane and damn well go fix it - so next morning I did, and when I got there things got weirder: because there was the hub, and there was the wall plug, but there wasn't any router or cable between them - and my spare router? The metronet refused to accept the connection because the account was in use.

So I got them to kill the link - and that wasn't easy, but they eventually did - and I got my users back on-line using the spare.

By then of course I'd missed my planned return flight - and having a very late lunch by yourself on a hot day in downtown Vancouver isn't any more uplifting a social experience than ordering a turbaned cab driver (one of those rejoicing in his inalienable right to engage in a lifetime work-to-rule strike) to go back to the office while practically in sight of the airport.

Blindingly obvious, right? A day and a half gone, a PO'd former friend at Sun service, $2,000+ in expenses, and a lot of upset people before I finally think to ask the right question: where's the HP?

Answer: on loan to some lawyer two floors down...


Paul Murphy wrote and published The Unix Guide to Defenestration. Murphy is a 25-year veteran of the I.T. consulting industry, specialising in Unix and Unix-related management issues.