Draft Blog Entries

% fortune -ae paul murphy

The relational mistake

If you've ever wondered what killed 4GLs, the answer is a mistake - one that permeates almost everything in IT from about 1969 through to today.

Codd's relational model wasn't designed as a solution to data storage problems; it was designed to address application stovepiping, and used common data storage as the means of doing that.

Codd called his model "relational" because he'd used textbooks written using British English while in school - textbooks in which what we call set theory is labeled the theory of relations.

Unfortunately most data processing professionals were and are familiar with the entity relationship models used in hierarchical databases like IMS, so they're sure they know what relational means - but they're dead wrong: there is no relationship between entity-relationship constructs and Codd's relational model.

Back in the late 1960s one of the key emerging problems facing IBM's System 360 customer base was that executives were starting to notice that data entered in something like the Payroll package generally had to be re-entered for use somewhere else - like the GL. The reason for that was known as stovepiping - the tendency for each application designer to create data definitions and matching files that worked only with the particular batch job they were designed for. Combine that with the standard data processing approach of building what we call applications, like a GL, as collections of rigidly ordered batch jobs and these groupings quickly became the data equivalents of Towers of Babel - structures that look like 1880s industrial smoke stacks, or stovepipes, in systems diagrams.

Codd's solution was simple: define a common data store, enforce commonality through access rules, and minimize error, duplication, and the resources needed for data management by storing data in the minimal relations (i.e. "sets" in modern (American) mathematical usage or "tables" in IT usage) necessary.
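To make the shared-store idea concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is an illustration, not anything Codd or IBM shipped: the table names (employee, pay_entry), the columns, and the two functions are all hypothetical. The point is only that pay data is entered once, in minimal tables, and both a "payroll" view and a "GL" view read the same rows instead of each keeping its own file.

```python
import sqlite3


def build_shared_store():
    """Create one common store; all names here are hypothetical."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE employee (
            emp_id      INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            cost_center TEXT NOT NULL
        );
        CREATE TABLE pay_entry (
            emp_id       INTEGER NOT NULL REFERENCES employee(emp_id),
            period       TEXT NOT NULL,
            amount_cents INTEGER NOT NULL,
            PRIMARY KEY (emp_id, period)
        );
        """
    )
    return db


def payroll_total(db, period):
    """The 'payroll' view: total pay for one period."""
    row = db.execute(
        "SELECT COALESCE(SUM(amount_cents), 0) FROM pay_entry WHERE period = ?",
        (period,),
    ).fetchone()
    return row[0]


def gl_by_cost_center(db, period):
    """The 'GL' view: the same rows, summarized by cost center."""
    rows = db.execute(
        """
        SELECT e.cost_center, SUM(p.amount_cents)
        FROM pay_entry AS p
        JOIN employee AS e ON e.emp_id = p.emp_id
        WHERE p.period = ?
        GROUP BY e.cost_center
        """,
        (period,),
    ).fetchall()
    return dict(rows)
```

Nothing is re-keyed for the GL: the cost center lives once on the employee row, the amount once on the pay entry, and each "application" is just a different query over them.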

Out of that came the entire Future Systems software architecture - eventually released, minus the, by then discredited, client-server component, as the System 38. That system envisaged applications as windows into a shared database - and for that reason came with a report generator (RPG) and not COBOL as the primary development language. RPG enables that Window to the database - a model in which user accessible applications are envisaged as CRUD frontends to a database that protects its own integrity, and nothing more.
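The "database that protects its own integrity" part can be sketched the same way. What follows is not RPG and not System 38 code; it is a rough Python and sqlite3 analogy with hypothetical table and function names. The rules (a posting must reference a real account, and must be non-zero) are declared in the store itself, and the front end is a thin window that simply attempts the operation and reports whether the database accepted it.

```python
import sqlite3


def open_store():
    """Open a store whose rules live in the schema (hypothetical names)."""
    db = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    db.execute("PRAGMA foreign_keys = ON")
    db.executescript(
        """
        CREATE TABLE account (
            acct_no TEXT PRIMARY KEY,
            title   TEXT NOT NULL
        );
        CREATE TABLE posting (
            posting_id   INTEGER PRIMARY KEY,
            acct_no      TEXT NOT NULL REFERENCES account(acct_no),
            amount_cents INTEGER NOT NULL CHECK (amount_cents <> 0)
        );
        """
    )
    return db


def create_posting(db, acct_no, amount_cents):
    """Thin front end: no validation of its own, the store decides."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO posting (acct_no, amount_cents) VALUES (?, ?)",
                (acct_no, amount_cents),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Because the checks sit in the store, a second front end written by someone else cannot get them wrong; it can only be told no.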

Very few people in IT understand this but it's why System 38 applications have stood the test of time, why mainframers hate it, and why the Unix relational 4GLs were so insanely great - because they combine the real relational model with the user freedom afforded by the power, cost, and underlying community model for Unix.

So, bottom line, is there hope? Sure, but the key reason those people I talked about yesterday think it utterly asinine for me to claim that I could redo their whole applications suite in sixty days while entirely removing the database and OS dependencies, the existing security vulnerabilities, and the maintenance cost headaches is simply this: their applications and the entire worldview and architecture on which they're built reflect applications thinking based on the entity relationship model. And that means there's a zinger: this was a mistake that has cost the industry and its customers years, lives, and billions in part because it was easily perpetuated in the single-threaded world of simple-minded application programming.

So now what? Can this be perpetuated in tomorrow's multi-threaded applications models? I think it can, but the efficiency costs will increase by quite a lot and that, in turn, spells opportunity for those long dead Unix 4GLs to re-emerge as hardware change forces application design change down the throats of today's experts.

Paul Murphy wrote and published The Unix Guide to Defenestration. Murphy is a 25-year veteran of the I.T. consulting industry, specializing in Unix and Unix-related management issues.