% fortune -ae paul murphy

From Chapter one: Data Processing and the IBM Mainframe

This is the 6th excerpt from the second book in the Defen series: BIT: Business Information Technology: Foundations, Infrastructure, and Culture

Note that the section this is taken from, on the evolution of the data processing culture, includes numerous illustrations and note tables omitted here.

Roots (part four: The System 360)

As a commercial machine the 360 was an enormous success and continued IBM's almost complete dominance of the electro-mechanical data processing market into the automated data processing era.

Although there are many theories about why and how the 360 gained market dominance, ranging from vague assertions about conspiracies to careful research on the impact of IBM leasing policies, the truth is that the 360 was both functionally more attractive and easier to buy than its competitors in this field.

The IBM System 360 was easier to buy than its competitors for two reasons:

  1. Because essentially all data processing people reported to senior financial managers who knew and trusted IBM as a company they were used to dealing with; and,

  2. Because IBM alleviated service and failure concerns by leasing out, rather than selling, the machines and then built availability guarantees into those leases by including on-site hardware support.

Thus the people who made the decision to get the first commercial 360 gear were generally the people in charge of the existing data processing adjuncts to corporate or organizational financial accounting units. To them, automatic data processing was a natural step forward from the electro-mechanical tabulators they were comfortable with, and buying from IBM just continued a long-standing corporate tradition.

The key applications were, of course, the same as the ones the mechanical tabulators had been applied to: highly repetitive tasks like making new journal entries, calculating payroll amounts, totaling waybills, or processing insurance information. Thus those same repetitive clerking tasks became early targets for software development.

Where the software for this came from IBM - which employed many of the very best developers in the industry - or an IBM vendor partner, it tended to be both simple and effective. Thus a company could lease a 360 and be running GL batches less than ninety days after delivery, but could not expect the application to seamlessly handle foreign currency transactions.

The financial argument for this type of clerical task automation was compelling. A mid-range IBM 360 could offer financial break-even by replacing about three hundred clerks. More significant benefits came, however, from its ability to get the work done both more accurately and more quickly, while freeing financial management from having to deal with a large number of clerical workers.

To many senior finance people the 360 looked, therefore, like a gift since it could achieve break-even on clerical automation with just one application - typically GL or Payroll - and then pay for itself again on another application such as billing.

In most cases the direct savings - replacing 600 clerks costing about $33,000 per week with a single machine leased at about $120,000 per month netted only about $12,000 per month, because the original data processing people stayed on - weren't as important in the long run as the reductions in errors, time lags, and the overhead involved in applying battalions of clerks to the same work.
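As a rough check on that arithmetic, here is a minimal sketch assuming four pay weeks per month, with the dollar figures taken straight from the text:

  # Rough check of the clerical-savings arithmetic quoted above.
  # Assumption: four pay weeks per month; all dollar figures are the ones in the text.
  clerk_payroll_per_week = 33_000       # total for the roughly 600 clerks replaced
  machine_lease_per_month = 120_000     # approximate monthly lease on the 360

  clerk_cost_per_month = clerk_payroll_per_week * 4               # 132,000
  direct_savings = clerk_cost_per_month - machine_lease_per_month
  print(direct_savings)                 # 12000 - in line with the figure quoted above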

As a result most financial officers were more than willing to fund additional development proposals, provided only that the success or failure of the new effort not affect the reality of the benefits already being received.

Writing complex COBOL applications looks easy because programs made up of a few simple steps are easy to write, but scaling up to more complex programs is actually extremely difficult. Thus a COBOL process that looked like:

  1. Clear the data registers
  2. Load a record from the file
  3. Read data from that record and temporarily store it in a register
  4. Unload the card
  5. Read a related card from another file
  6. Move the data from the register to the new card
  7. Unload the modified card back to the file
  8. Loop back to the clear register step until all cards in the first file are done

made perfect intuitive sense to traditional data processing managers because they'd been doing things in exactly that way for nearly sixty years - embedding skills honed through three generations of managers.
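In a modern scripting language the whole eight-step loop collapses to a handful of lines. The sketch below is only an illustration of the pattern - the file names and comma-separated layouts are hypothetical, and the language of the day was COBOL working on fixed-width 80-column card images, not Python reading text files:

  # A minimal sketch of the card-image update loop listed above.
  # Assumes the transaction and master files have already been sorted into the
  # same key order, so that corresponding records sit on matching lines.
  # Hypothetical layout for both files: key,value
  def update_master(transaction_path, master_in_path, master_out_path):
      with open(transaction_path) as transactions, \
           open(master_in_path) as master_in, \
           open(master_out_path, "w") as master_out:
          for trans_line, master_line in zip(transactions, master_in):
              # "Load a record from the file" and read its data into "registers"
              t_key, amount = trans_line.rstrip("\n").split(",")
              # "Read a related card from another file"
              m_key, m_total = master_line.rstrip("\n").split(",")
              # Pre-sorting is what guarantees t_key == m_key here; a real job
              # would route mismatches to an error file rather than assume it.
              # "Move the data from the register to the new card"
              new_total = float(m_total) + float(amount)
              # "Unload the modified card back to the file"
              master_out.write(f"{m_key},{new_total:.2f}\n")
              # The for loop is the "loop back until all cards are done" step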

Unfortunately they had no idea of the complexities created when this simple looking process is applied to real problems and implemented in the COBOL/360 environment.

From the "shark tank" (funny snippets) section of Computerworld's Dec. 16/02 issue
This big-deal R&D lab has a sequence of mainframe jobs that are just taking too long, so management calls in consultant pilot fish and his partner to speed them up -- no matter what it takes. "The system passed data from one step to the next, and at each step, the data was sorted using a very efficient algorithm that nevertheless added hours to the overall job," says fish. "After an hour of careful listening, we presented our solution, which easily cost us tens of thousands in consulting fees: Stop sorting the tapes. After each of the five steps, the data was resorted to the same order!"

Note, however, that because the processing undoubtedly depended on the sort order, actually taking this advice would have been as stupid as the writer tries to make the client look.

Many of the key software designers working at IBM and a number of its business partners did understand the issues. Most had significant academic credentials in one of the sciences, mathematics, or engineering. Thus many started out with an understanding of what it takes to produce working software because they could apply their broader, research-based knowledge to develop the fine-grained process understanding needed for success.

Those skills led them to design the simplest, most generic, applications possible.

In contrast, most of the data processing managers who suddenly undertook corporate code development projects had none of these advantages. Few had the necessary education or experience to understand how COBOL scales in complexity and most of the projects they undertook were non-generic and therefore more difficult to specify and complete.

Many people, furthermore, saw relatively difficult problems like insurance claims processing as much simpler than they really were because their previous exposure to the business had been filtered through Finance and the limitations of what Finance management thought could be done with electro-mechanical tabulators.

By way of comparison...
In contrast to its business success, the 360 was a complete failure as a scientific and research machine outside IBM's most committed customer base, and it essentially took the company out of the supercomputer business for the next thirty years.

The 360/75, introduced in 1965, had a storage capacity of up to 1,048,576 bytes and a memory cycle time of 750 nanoseconds (ns). The 1961 CDC, in contrast, had up to 256K of 80-bit words (2.6MB) and a memory cycle time of about 240 ns.

The Model 91, introduced in 1966, could have up to 8MB of storage and was the fastest 360 until the Model 195 in 1970. Its CPU cycle time (the time it takes to perform a basic processing step) was 60 nanoseconds. Floating point adds took 1-2 cycles, multiplies 4-5, and divides 7-9; but by then CDC offered array processors with 25ns cycle times and the ability to do multiple arithmetic operations in parallel.

Some military variants of the 360/370 line included custom hardware designed to address this through larger memories or better floating point, but sales were made on customer loyalty rather than performance.

On the other hand the 360 was much better at I/O than the CDC 6400 - leading to the oddity that some reasonably small programs would take about the same time to run on the two machines, because they would load in minutes but run in seconds on the CDC and load in seconds but run in minutes on the 360.

Even those few who knew they didn't know couldn't find enough real experts to help them. The 360 sold in unprecedented numbers, with each sale creating a requirement for anywhere from thirty to three hundred staff qualified to fill roles such as "System Programmer." Those people simply didn't exist in the numbers needed.

Worse, non-IBM people with genuine computer expertise to offer, mainly academics and recent engineering graduates, had experience only with real numerical computing and not with the largely unrelated disciplines of automatic data processing.

From a distance the 360 looked powerful and seemed to offer advanced features such as disk drives and a resident operating system, but people whose computer experience had been gained working with numerically oriented machines like the CDC 6400 quickly found their assumptions about the resource mix - fast CPU and memory, slow I/O - reversed when their work collided with the 360's COBOL architecture.

The System 360/COBOL combination did work well for those who straddled the computing and automatic data processing fields, but it had the rather odd effect that sorting became the pre-eminent System 360 application of the sixties and seventies - just as, for related but different reasons, middleware became the defining System 390 application of the eighties and early nineties.

Consider, for example, the problem of processing thousands of claims for payment under a medical fee for service plan. To do this you start with four main files:

  1. A file of physicians qualified to bill under the plan;
  2. A file of allowable services showing the amounts due for each;
  3. A file of eligible patients; and,
  4. A file consisting of claims to be processed - each of which identifies a physician, a service, and a patient.

Today you'd just read the files into associative arrays, do the arithmetic, and push out the results; but with something like 64K of available main storage that wasn't remotely feasible on a System 360.
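A minimal sketch of that in-memory approach - the file names and CSV layouts below are hypothetical, used only to show how little code it takes once everything fits in main storage:

  # The "read everything into associative arrays" version of claims processing.
  # File names and CSV layouts are hypothetical illustrations only.
  import csv
  from collections import defaultdict

  def process_claims(physicians_csv, services_csv, patients_csv, claims_csv):
      with open(physicians_csv) as f:
          physicians = {row["physician_id"] for row in csv.DictReader(f)}
      with open(services_csv) as f:
          fees = {row["service_id"]: float(row["fee"]) for row in csv.DictReader(f)}
      with open(patients_csv) as f:
          patients = {row["patient_id"] for row in csv.DictReader(f)}

      totals = defaultdict(float)    # fees due, keyed by physician id
      errors = []                    # claims failing any eligibility check
      with open(claims_csv) as f:
          for claim in csv.DictReader(f):
              if (claim["physician_id"] in physicians
                      and claim["service_id"] in fees
                      and claim["patient_id"] in patients):
                  totals[claim["physician_id"]] += fees[claim["service_id"]]
              else:
                  errors.append(claim)
      return totals, errors

With everything in memory the sorts disappear entirely; the procedure that follows is what the same job looked like when memory, not programmer time, was the scarce resource.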

The right answer in a highly restricted environment like the System 360's is to combine file sorting with batch processing:

  1. First sort the claims file and the physician file together by physician id (looked up and entered at keypunch time to avoid having to use physician names, which can have duplicates). The resulting file will have each eligible physician and the claims naming that physician on adjacent records (cards).

  2. Then process that output file to remove records originating from the physician file, output records from the claims file that don't match a physician to an error file, and output the rest to an in-process file.

  3. Do the above steps twice more to remove claims for unauthorized services or for which the claimed fee is incorrect;

  4. Sort the final in-process file according to both the physician id and the patient id so that the physician record appears just at the head of each group of patient records in which this doctor is claiming service fees; and,

  5. Process that file to accumulate total fees due each physician.
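The matching passes (steps 2 and 3) and the final accumulation pass are all simple record-at-a-time loops. Here is a sketch of the step 2 pass, assuming a hypothetical layout in which a type code and physician id lead each record, and assuming step 1's sort places each physician master record ahead of the claims carrying its id:

  # Step 2: copy claims whose physician id matched a physician master record to
  # the in-process file, route the rest to an error file, and drop the
  # physician master records themselves.
  # Hypothetical layout: type ("P" or "C"), physician id, rest of the record.
  def filter_claims(merged_in_path, in_process_path, error_path):
      current_physician = None
      with open(merged_in_path) as src, \
           open(in_process_path, "w") as ok, \
           open(error_path, "w") as bad:
          for line in src:
              rec_type, physician_id, _rest = line.rstrip("\n").split(",", 2)
              if rec_type == "P":
                  current_physician = physician_id  # remember it, don't copy it forward
              elif physician_id == current_physician:
                  ok.write(line)                    # claim matches an eligible physician
              else:
                  bad.write(line)                   # no matching physician record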

Each processing step not involving a sort typically needs to hold no more than three records (cards) in memory at once. For example, the last step reads the physician id from a group header record and then reads the next record. The pre-sorts ensure that this will be a patient record, so it is used to initialize the payment record for that physician.

Records are then read and checked for type, and the accumulator is increased, until the next physician record is encountered. At that point the current physician id and accumulator total are written out and a new cycle starts.
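That last pass is a classic control-break loop over the sorted in-process file. A minimal sketch of its logic, again with a hypothetical three-field record layout (the real implementation would have been COBOL reading fixed-width card images):

  # Step 5: accumulate total fees due each physician. The in-process file is
  # sorted so that a physician header record ("P") precedes the claim records
  # charged to that physician. Hypothetical layout: type, id, fee.
  def accumulate_fees(sorted_in_path, payments_out_path):
      with open(sorted_in_path) as src, open(payments_out_path, "w") as dst:
          current_physician = None
          total = 0.0
          for line in src:
              rec_type, record_id, rest = line.rstrip("\n").split(",", 2)
              if rec_type == "P":                    # new group header
                  if current_physician is not None:  # flush the previous group
                      dst.write(f"{current_physician},{total:.2f}\n")
                  current_physician, total = record_id, 0.0
              else:                                  # claim record: add its fee
                  total += float(rest)
          if current_physician is not None:          # flush the last group
              dst.write(f"{current_physician},{total:.2f}\n")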

Used properly, the 360 exactly duplicated the old electro-mechanical processes but was much faster, efficiently handling up to several hundred thousand claims per day in this way - and, by the late seventies, most medical claims in Canada were indeed being processed on 370s.

Unfortunately the combination of COBOL's apparent ease of use with the power of the 360 led many organizations to believe they could effectively harness the system to achieve significantly more sophisticated processing.

Thus, as a result of factors including:

  1. Enormous payback from the first "low hanging fruit" projects;
  2. Missing or inappropriate expertise;
  3. The inapplicability of the COBOL card sorting model to complex tasks;
  4. The apparent but misleading ease of use offered by COBOL; and,
  5. The limitations of the 360 as a computational device,
thousands of organizations launched massive systems development projects - nearly all of which failed.

In effect what happened was that qualified computer people from outside the data processing industry tended to produce code that simply could not be run on the System 360s, while code produced by the unqualified generally failed because their perceptions of both COBOL and the applications were simplistic.

On the whole failure rates in excess of 95% had very little impact on systems management in part because the people responsible were generally deeply embedded in the Finance departments they worked for, but mostly because the machines had generally been brought in by Finance executives searching for ways to reduce cost and error on highly repetitive tasks like payroll, ledger entry, or some form of claims processing. As long as that core service was reliably delivered, failures elsewhere were not typically seen as important to Finance - whatever their actual cost or importance elsewhere in the organization.

One consequence of this was the rapid evolution of managerial means to ensure processing continuity for those key applications.

For example:

  1. Capacity planning almost instantly became a critically important input to the systems budget process;

  2. Maintaining 24 x 7 operations became critical to delivering printed reports (the traditional conception of the data processing role) on time; and,

  3. Rigid role separation and extensive committee reviews became important organizational controls on changes that threatened to hiccup the regular production schedule.

On the project side, furthermore, data center managers learned the value of user involvement and the committee structure as a way of diffusing responsibility while, more importantly, ensuring that all but the most blatant failures could be declared successes. Combined with the hard lessons learned in keeping production running, the committee system led to the evolution of many different systems development life cycle (SDLC) methodologies and the widespread imposition of extensive documentation requirements for the smallest system change.

Coal to Newcastle, anyone?
In most data centers new hires are put to work on code maintenance while more experienced people work on new systems development. This may seem reasonable, but seriously reduces overall productivity and project success rates.

Why? Because the experienced people got that experience with the languages and technologies embedded in the "legacy" systems but spend their time trying to adapt unfamiliar new technologies in their development work, while the younger new hires, to whom these new technologies are "native", have to work with obsolete languages and technologies they don't understand and don't want to use.

By the early seventies so many organizations had been burned, often badly enough to draw senior management's attention, that restrictions on change spawned a whole new discipline: maintenance programming, designed to get around those restrictions.

A maintenance programmer does code development, but development aimed either at making minor improvements to something that works or at keeping something working as the circumstances surrounding its use change. Maintenance programming grew out of the hacker tradition initiated by people who modified IBM's Type II, or sample, applications to achieve local objectives.

Since those sample applications were generally good enough to get the customer up and running within ninety days of systems delivery, the initial hacks tended to be limited to low complexity things like embedding the corporate name in report headers. As a result these activities could reasonably be assigned to fairly junior staff while seniors planned further systems development.

As production systems became both more critical and more complex, however, the risks became greater and the shelter offered by development assignments correspondingly more attractive. As a result maintenance programming remained the province of the newbies and perceived organizational losers long after it became the highest risk, and most difficult, part of daily operations.

Today maintenance programmers are the data center staff most likely to work with actual users as they remediate application problems and develop minor system enhancements. This, of course, usually requires new code development, but on live systems instead of isolated test environments - and it demonstrates once again how organizational pressures can lead to counter-intuitive results: in this case, the people usually ranked lowest in the data center's professional hierarchy tend to be the ones making the most important contributions to the organization's daily operational success.

---

Some notes:

  1. These excerpts don't include footnotes and most illustrations have been dropped as simply too hard to insert correctly. (The wordpress html "editor" as used here enables a limited html subset and is implemented to force frustrations like the CP/M line delimiters from MS-DOS).

  2. The feedback I'm looking for is what you guys do best: call me on mistakes, add thoughts/corrections on stuff I've missed or gotten wrong, and generally help make the thing better.

    Notice that getting the facts right is particularly important for BIT - and that the length of the thing plus the complexity of the terminology and ideas introduced suggest that any explanatory anecdotes anyone may want to contribute could be valuable.

  3. When I make changes suggested in the comments, I make those changes only in the original, not in the excerpts reproduced here.


Paul Murphy wrote and published The Unix Guide to Defenestration. Murphy is a 25-year veteran of the I.T. consulting industry, specializing in Unix and Unix-related management issues.