Draft Blog Entries

% fortune -ae paul murphy

Brief: 3

This is the third excerpt from the first book in the Defen series: The Board Member's IT Brief.

From 1.3 Information Integrity

1.3.1 One term, many meanings

When PC people not yet merged into the data processing culture talk about "information integrity" they're generally using the phrase as a $10 replacement for the word "security" - itself a term that has a unique meaning in their segment of the IT industry. Specifically, the PC community has a long history of susceptibility to data theft and software failure, either caused internally or through an outside agent such as a thief or a network hacker. As a result "security" to them refers mainly to measures taken to counter or diminish such attacks.

Thus PC software intended to prevent thieves from recovering data on stolen gear, to keep hackers out of networks, or to scan incoming e-mail for attack code is all described as "security" software in private and often as part of an "information integrity assurance" program in meetings with bosses.

In the IBM mainframe and Unix environments, where PC-style security concerns are largely a non-issue, "security" usually refers to physical security: ensuring that locked doors are locked, that authorized accesses are actually authorized, and that backup tapes are made but not lost.

Although people from each of these IT communities will sometimes use the term "information integrity" when considering compliance issues such as those associated with the American Sarbanes-Oxley Act or, more recently, the civil discovery rules, this isn't (yet) very common. Instead these issues tend to be discussed as compliance or audit rules, with some linkage to the "security" concept in the data processing and Unix worlds but very little of that occurring yet in the PC world.

1.3.2 "The real deal" is found in Shannon's Model

In general use the term "information integrity" refers to the accuracy and completeness of information. In the specific context of accounting and other management uses of the term, however, the twin concepts of timeliness and authorization modify these basic elements of information integrity.

Thus an information system --including all automation, people, and any required manual procedures and organizational structures-- demonstrates information integrity if and only if its outputs fully and fairly reflect its inputs and do so in a timely manner without allowing unauthorized access.

Almost everything known about information integrity derives from work done by Claude Shannon at AT&T Bell Laboratories in the nineteen thirties and forties.

We can easily and directly map Shannon's general model to the extensive body of research on information integrity in automated and communications systems, with direct reference to common sources of "integrity impairment" opportunities (i.e. systems and processes in which things often go wrong), including:

Data collection and entry systems;
Communications;
Database management;
Processing and related support software;
Reporting, abstraction, and summarization systems; and,
Information distribution and usage.

Click for diagram (PDF)

In Shannon's formulation, each impairment opportunity is modeled in terms of a probability distribution that, in use, allows us to answer the question: what is the likelihood that source data will be fully and correctly reflected in the messages arriving at the destination, and only in those messages, with no unauthorized delay, copying, additions, or deletions?
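To make that question concrete, here is a minimal sketch (mine, not Shannon's notation) that treats each impairment opportunity in the list above as a stage with some probability of passing a record through intact. The stage probabilities are made-up illustrative numbers, and the multiplication assumes the stages fail independently of one another, which real systems often do not.

```python
from math import prod

# Hypothetical per-stage probabilities that a record passes through intact
# (accurate, complete, timely, and untouched by unauthorized parties).
# Stage names follow the list in the text; the numbers are invented.
stage_integrity = {
    "data collection and entry": 0.995,
    "communications": 0.999,
    "database management": 0.9995,
    "processing and support software": 0.998,
    "reporting and summarization": 0.997,
    "distribution and usage": 0.99,
}


def end_to_end_integrity(stages):
    """Probability a record survives every stage, assuming independent stages."""
    return prod(stages.values())


def weakest_stage(stages):
    """Name of the stage with the lowest probability of intact transit."""
    return min(stages, key=stages.get)


overall = end_to_end_integrity(stage_integrity)
print(f"end-to-end integrity: {overall:.4f}")
print(f"weakest stage: {weakest_stage(stage_integrity)}")
```

The point of the sketch is only that the overall figure can never be better than the weakest stage, so that is usually where the first remediation dollar should be considered.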

This formulation allows us to ask the obvious management questions:

what are the costs and benefits of risk reduction for each source of risk?
who is directly responsible for risk assessment and remediation? and,
how lossy (likely to lose accessibility, control of data, accuracy, or completeness) is the overall (combined) process over longer periods of time?
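The third question, lossiness over longer periods, comes down to compounding. A rough sketch, assuming (hypothetically) that each year has the same, independent probability of getting through without an integrity impairment:

```python
def prob_of_any_impairment(per_period_integrity, periods):
    """Chance of at least one integrity failure over `periods` periods,
    assuming independent periods with identical per-period integrity."""
    return 1.0 - per_period_integrity ** periods


# 0.98 is an invented per-year figure, used purely for illustration.
for years in (1, 5, 10):
    p = prob_of_any_impairment(0.98, years)
    print(f"{years:>2} years: {p:.1%} chance of at least one impairment")
```

Even a process that looks very reliable in any one year becomes noticeably lossy when viewed over a decade.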

It is therefore possible to model the cost of meeting specific information integrity requirements such as those mandated in any standard CFO job description (or the Sarbanes-Oxley legislation) using exactly the same process as you would for any other business investment decision: quantify the risks, assess the probabilities, work out an expected net cost of loss, compare that to the expected cost of remediation, and make the indicated decision.
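As a sketch of that decision process, the following compares the expected annual saving from remediation against its annualized cost for each risk. The `Risk` class, its field names, and the two sample risks are hypothetical illustrations, not figures from any real assessment.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    loss_per_event: float             # net cost of one occurrence
    events_per_year: float            # expected frequency as things stand
    events_per_year_after: float      # expected frequency after remediation
    remediation_cost_per_year: float  # annualized cost of the fix

    def expected_annual_loss(self) -> float:
        """Expected yearly loss if nothing is done."""
        return self.loss_per_event * self.events_per_year

    def expected_annual_saving(self) -> float:
        """Expected yearly loss avoided by remediating."""
        avoided = self.events_per_year - self.events_per_year_after
        return self.loss_per_event * avoided

    def worth_fixing(self) -> bool:
        """True when the expected saving exceeds the cost of the fix."""
        return self.expected_annual_saving() > self.remediation_cost_per_year


# Invented sample risks, for illustration only.
risks = [
    Risk("data entry errors", 2_000, 50, 10, 40_000),
    Risk("lost backup tape", 250_000, 0.1, 0.05, 20_000),
]

for r in risks:
    verdict = "remediate" if r.worth_fixing() else "accept the risk"
    print(f"{r.name}: expected loss ${r.expected_annual_loss():,.0f}/yr, "
          f"saving ${r.expected_annual_saving():,.0f}/yr -> {verdict}")
```

This ignores discounting and risk aversion; it is only meant to show that the comparison is the same one you would make for any other investment.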

An Example

Suppose the ICell Company uses an e-commerce application of significant importance to its revenues. Suppose further that a single ten-hour database shutdown is expected to cost ICell one million dollars on net, and that competitors using the same computing technology are known to suffer, on average, two such outages a year.

In other words the undiscounted expected annual cost of failure is two million dollars. Now suppose that replacing the people and technologies in place with a more reliable combination, one expected to fail no more than once every two years, would cost $500,000 initially and no more to use than the existing technology.

In this example the alternate technology reduces the expected annual cost of failure from $2,000,000 to about $500,000 (one $1,000,000 outage every two years) without increasing operating cost, so the $500,000 initial outlay buys roughly $1,500,000 a year in expected savings. That makes the decision a no-brainer, but in other cases you may need to look at the expected cash flows in some detail before making a decision.
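For cases that are less clear-cut, the cash flow comparison might be sketched as follows. The loss, outage rates, and up-front cost come from the ICell example; the three-year horizon, the 10% discount rate, and the assumption that savings arrive at the end of each year are mine, chosen only for illustration.

```python
LOSS_PER_OUTAGE = 1_000_000     # from the example
CURRENT_OUTAGES_PER_YEAR = 2.0  # from the example
NEW_OUTAGES_PER_YEAR = 0.5      # "no more than once every two years"
UPFRONT_COST = 500_000          # from the example


def expected_annual_loss(outages_per_year, loss_per_outage=LOSS_PER_OUTAGE):
    """Undiscounted expected yearly cost of outages."""
    return outages_per_year * loss_per_outage


def npv_of_switching(years, discount_rate):
    """Discounted expected savings over `years`, less the up-front cost.

    Assumes savings arrive at the end of each year and operating costs
    are unchanged, as in the example.
    """
    annual_saving = (expected_annual_loss(CURRENT_OUTAGES_PER_YEAR)
                     - expected_annual_loss(NEW_OUTAGES_PER_YEAR))
    discounted = sum(annual_saving / (1 + discount_rate) ** t
                     for t in range(1, years + 1))
    return discounted - UPFRONT_COST


print(f"current expected loss: ${expected_annual_loss(CURRENT_OUTAGES_PER_YEAR):,.0f}/yr")
print(f"new expected loss:     ${expected_annual_loss(NEW_OUTAGES_PER_YEAR):,.0f}/yr")
# Horizon and discount rate below are assumptions, not part of the example.
print(f"NPV of switching:      ${npv_of_switching(years=3, discount_rate=0.10):,.0f}")
```

With these inputs the net present value is comfortably positive; the same function shows how quickly that changes if the up-front cost rises or the reliability gain shrinks.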

Some notes:

These excerpts don't include footnotes, and most illustrations have been dropped as simply too hard to insert correctly. (The WordPress HTML "editor" as used here enables a limited HTML subset and is implemented to force frustrations like the CP/M line delimiters from MS-DOS.)
The feedback I'm looking for is what you guys do best: call me on mistakes, add thoughts/corrections on stuff I've missed or gotten wrong, and generally help make the thing better.

Paul Murphy wrote and published The Unix Guide to Defenestration. Murphy is a 25-year veteran of the I.T. consulting industry, specializing in Unix and Unix-related management issues.