% fortune -ae paul murphy

Lessons from a riddle

Cousin Dan works at the farmer's market where he sells sausages. From the brim of his ten gallon hat to the soles on his cowboy boots, Dan measures exactly two inches more than the 70 inches between his outstretched fingertips. What does Dan weigh?

If you can't figure that out, ask any five year old - they'll gleefully tell you that Dan weighs sausages.

One of the basic principles of scientific investigation is that you should never ignore any available information, even if it doesn't appear to have anything to do with the issue.

So how does this apply to IT? I've been researching compiler theory as part of an effort to figure out what it would take to use Sun's SMP/CMT technologies effectively. As part of that process I've been looking at the notion of self-optimizing code, and discounting most of what's been written about it as neither realistic nor useful.

Suppose, however, that you want to design a compiler that looks like an execution application running in server mode: it sits around waiting for you to throw code at it, executes that code, and logs what it does to produce the "compiled" executable. Great, but terribly inefficient if you don't find something useful for it to do while waiting for work to appear.

So the question comes down to this: why throw away priceless run-time information by ending that compiler-code relationship once the executable file is ready to run?

On Solaris, with dtrace, one relatively interesting option is to have the compiler track and monitor the code it produces with an eye to optimizing it while in use. Within the local system this is relatively straightforward, and remember the compiler doesn't have to instrument the code because dtrace instruments the kernel instead. So imagine you've got your compiler watching which branches its creations actually take and re-ordering code to make sure those always come up first.

The biggest bang for the buck on static functionality is going to come from reductions in cache misses, particularly for CPUs, like Microsoft's Xenon PowerG5 adaptation, that aren't heavy on branch prediction hardware, but it's a lot easier to visualize operation in much simpler cases.

Imagine, for example, that you have an eight-way CASE statement in your code and all eight cases come up, but number seven comes up eighty percent of the time in your organization's use of the application. Simply putting it first avoids six checks eighty percent of the time. It's like expression ? expression : expression, but done over larger code blocks and differently for different users.
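
To make the arithmetic concrete, here is a minimal Python sketch of that CASE example. It is not a compiler and not dtrace: the AdaptiveDispatch class, the count_checks helper and the 100-call workload are all made up for illustration, and the "profile" is just a counter kept by the dispatcher itself.

```python
from collections import Counter


class AdaptiveDispatch:
    """A linear chain of case checks that can be re-ordered by observed use.

    Hypothetical stand-in for compiler-generated CASE code; the hit counter
    plays the role of run-time profile data.
    """

    def __init__(self, cases):
        self.order = list(cases)  # order in which the cases are tested
        self.hits = Counter()     # how often each case actually matched
        self.checks = 0           # total comparisons performed so far

    def dispatch(self, value):
        # Walk the chain in its current order, counting every comparison.
        for case in self.order:
            self.checks += 1
            if case == value:
                self.hits[case] += 1
                return case
        return None  # no case matched

    def reorder(self):
        # Most frequently matched cases go first. sorted() is stable, so
        # cases with equal counts keep their existing relative order.
        self.order = sorted(self.order, key=lambda case: -self.hits[case])


def count_checks(dispatcher, workload):
    """Run a workload and return how many comparisons it cost."""
    start = dispatcher.checks
    for value in workload:
        dispatcher.dispatch(value)
    return dispatcher.checks - start


# Assumed workload: 100 calls, 80 of them case seven, every case used.
WORKLOAD = [7] * 80 + [1, 2, 3, 4, 5, 6, 8] * 2 + [1, 2, 3, 4, 5, 6]

dispatcher = AdaptiveDispatch(range(1, 9))
before = count_checks(dispatcher, WORKLOAD)  # cases tested in source order
dispatcher.reorder()
after = count_checks(dispatcher, WORKLOAD)   # case seven now tested first
print(before, after)
```

On this invented workload the chain costs 639 comparisons in source order and 177 once case seven is tested first: six fewer checks on each of the eighty case-seven calls, at the price of one extra check on the calls for cases one through six.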

The gain here is trivial, but free, so why not do it? And, more importantly, why not go ahead and do it for major user groups like individual customers or departments within customer organizations?

Today, of course, there is no such compiler but there is a direct application to things like word processor design. Since there's a lot of truth to the claim that 90% of the features are ignored by 95% of the people 99% of the time, an implementation that combined the loadable module idea with dtrace could be rather nice. It could, for example, turn a 5MB word processor executable that does everything for everybody into a lightweight speedster that learns to do just what each user wants, with no more than a very small time penalty as new functions are first brought in temporarily and then compiled in if used repeatedly.
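
As a rough illustration of the "bring it in temporarily, then keep it if it's used repeatedly" idea, here is a small Python sketch. Everything in it is an assumption made for the example: the FeatureRegistry class, the promote_after threshold and the two toy features are hypothetical, and a use counter stands in for whatever dtrace would actually report.

```python
class FeatureRegistry:
    """Loads features on demand and keeps only the ones used repeatedly."""

    def __init__(self, loaders, promote_after=3):
        self.loaders = dict(loaders)        # feature name -> function that builds it
        self.promote_after = promote_after  # uses needed before a feature stays loaded
        self.resident = {}                  # features kept loaded permanently
        self.uses = {}                      # per-feature use counts
        self.load_count = 0                 # how many times a load cost was paid

    def call(self, name, *args):
        self.uses[name] = self.uses.get(name, 0) + 1
        feature = self.resident.get(name)
        if feature is None:
            # Not resident: pay the (small) load penalty for this call.
            feature = self.loaders[name]()
            self.load_count += 1
            # Used often enough? Keep it resident from now on.
            if self.uses[name] >= self.promote_after:
                self.resident[name] = feature
        return feature(*args)


# Two toy "features"; each loader returns the function that does the work.
registry = FeatureRegistry({
    "word_count": lambda: (lambda text: len(text.split())),
    "shout": lambda: (lambda text: text.upper()),
})

for _ in range(5):
    words = registry.call("word_count", "save a cycle here")
loud = registry.call("shout", "rarely used")

print(sorted(registry.resident), registry.load_count)
```

Here word_count pays the load penalty on its first three uses and is resident after that, while shout is loaded once and dropped again. A real implementation would presumably persist the counts per user or per department rather than per process.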

There's a long-run bottom line for developers here. You know what they say about pennies? Information works the same way: save a cycle here, a cycle there, and pretty soon what you have is a way of setting things up so your people develop and maintain a very generalized, one-size-fits-all application that looks fast and lightweight to the customer even while automagically tuning itself for that customer.


Paul Murphy wrote and published The Unix Guide to Defenestration. Murphy is a 25-year veteran of the I.T. consulting industry, specializing in Unix and Unix-related management issues.