
% fortune -ae paul murphy

Hey, whatcha looking at, sales dude?

If your job is to proactively find terrorists hiding inside a large civilian population, you can't adopt the usual police strategy of waiting for the crime to be committed and then looking carefully at everyone remotely in the emotional or commercial vicinity of the event. Instead, what police usually do is treat everyone in the population as suspect, try to guess at some set of parameters describing individuals likely to be of interest, and then look carefully at everyone matching the criteria derived from those parameters.

The problem with this, of course, is that setting the criteria too broadly wastes manpower and is generally considered unacceptable on civil libertarian grounds, while setting them too narrowly can defeat the purpose of the exercise.

Among the many responses in widespread use among American agencies with responsibilities in this area, the least invasive has been the use of pattern-hunting applications to trawl through communications and financial transaction data. Basically, what these systems do individually is see who connects to whom and at what level of remove - a high-stakes version of the game in which people try to enumerate the intermediaries needed to connect one movie star to another - and together, of course, they look for cases where a trail in one is broken in the other.
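As a rough illustration of the "level of remove" idea, here is a minimal sketch in Python. It assumes the input is nothing more than pairs of identifiers taken from connection records; the names, the sample data, and the functions `build_graph` and `degrees_of_separation` are all made up for this example and do not describe any agency's actual system. Note that the message contents never appear: only who connected to whom.

```python
from collections import defaultdict, deque


def build_graph(records):
    """Build an undirected contact graph from (party_a, party_b) pairs."""
    graph = defaultdict(set)
    for a, b in records:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def degrees_of_separation(graph, start, target):
    """Return the fewest hops linking start to target, or None if unlinked."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    # Breadth-first search visits contacts in order of distance,
    # so the first time we reach the target is the shortest path.
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Hypothetical connection records: who contacted whom, nothing else.
calls = [("ann", "bob"), ("bob", "cat"), ("cat", "dan"), ("eve", "fay")]
graph = build_graph(calls)

print(degrees_of_separation(graph, "ann", "dan"))
print(degrees_of_separation(graph, "ann", "eve"))
```

Running the same search over a second record set (say, financial transfers) and comparing the results is one simple way to spot a trail that exists in one data source but is broken in the other.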

From a libertarian perspective, what's important about this kind of approach to the problem of population scanning is that the contents of the communication or transaction have nothing to do with the pattern match and only become of interest after the system identifies one or more individuals as worth a closer look. Basically, pattern trawling can be thought of as imposing a very small cost to privacy for almost everyone, while profiling and its consequences impose very high invasion-of-privacy costs on smaller target groups.

It's not even obvious, in fact, whether the idea of privacy applies to transactions or communications carried out over public infrastructure if the snooping discloses only the existence of the connection and not its contents.

In other words, the biggest payoff this approach offers, in addition to comprehensiveness and low cost, is much broader political acceptability - we are comparing, after all, having the police work off-line with bank and telecom connection records to having them interview the friends and neighbours of everyone whose colleagues or relatives have traveled to Pakistan within the last five years.

And here's the zinger: if that payoff survives current court challenges in the United States, there are commercial applications waiting for the same technology.

Right now, for example, marketing people can use big on-line data warehouses to see what products get bought together by city, region, credit rating, or any one of a dozen or more other parameters - like in-store location. Nothing wrong with that, right? It's exactly how the police operate in criminal cases: wait for a 7/11 sales event (guy buys beer) and look at individual motivation (it's 8:35PM and guy buys a bag of Pampers too).
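For readers who haven't seen it done, the "what gets bought together" question boils down to counting item pairs across sales receipts. The sketch below is a toy version in Python under invented assumptions: the baskets are made-up data and `co_purchase_counts` is a name chosen for this example, not any warehouse product's API. A real system would also slice the counts by city, region, or the other parameters mentioned above.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how often each unordered pair of items shares a basket."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so ("beer", "diapers") and
        # ("diapers", "beer") are counted as the same pair.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


# Hypothetical receipts, one list of items per sale.
baskets = [
    ["beer", "diapers"],
    ["beer", "diapers", "chips"],
    ["milk", "bread"],
    ["beer", "chips"],
]
pair_counts = co_purchase_counts(baskets)

for pair, count in sorted(pair_counts.items()):
    print(pair, count)
```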

But here's a thought: the data's out there to go beyond that, all the way to content-free pattern discovery among all consumers, not just the store's customers - because finding and outing the hidden terrorist isn't that different from finding and selling to the customer you haven't seen yet.

In fact, here's a free prediction: we'll know it's happening when a national or major regional chain demonstrates its ability to predict the physical flow of a product fad, across both its own customer base and the people who don't normally come to its stores, well enough to be neither overstocked nor understocked in any store at any point in the process.

Paul Murphy wrote and published The Unix Guide to Defenestration. Murphy is a 25-year veteran of the I.T. consulting industry, specializing in Unix and Unix-related management issues.