% fortune -ae paul murphy

Capiche! it's the truth

The biggest problem with using the internet as a means of accessing information is that it's extremely difficult to tell truth from falsehood.

Do a Google search on almost any issue of consequence and you'll find materials written from many different perspectives and for many different audiences - and no tools to reliably help you tell them apart.

With Google you can limit your searches to the .edu domain, to "legitimate" journals, or to opinion blogs - but none of those filters comes with any guarantee, and you can't really evaluate the results for yourself without first knowing quite a lot about both the subject and the publishing environment involved.

The most generic component of this problem is something I think of as time scale discrepancy: the time scale on which information changes differs from one publishing community to the next, ranging from hourly in the political blogs to decadal for peer reviewed journals.

It's easy to find examples of the consequences imposed by time scale discrepancies between publishing communities: the mass media want highly simplified, politically loaded, global climate change information on a daily basis, but the emerging consensus on heliocentric causation has taken decades to develop academic momentum - meaning that public perception and policy making on this set of issues are basically an off-shoot from a direction not taken in the science. Similarly, Groklaw reports anti-SCO events and interpretations on a daily basis, but the courts move at a glacial pace with judgments coming months apart - making the information available to those outside the legal professions as out of joint as the time scales the two groups operate on.

Using existing search mechanisms to find something applicable to almost any question is trivially easy, but sorting through potentially hundreds of hits to find "the truth" is so difficult that most people simply don't bother - thus creating the real risk that the average information seeker is now more likely to be misled than ever before.

Suppose, for example, that you wanted to know the mass of the earth - pretty simple, right? On the most superficial level it is: check Wikipedia, for example, and you'll find a rough approximation to a consensus (1970s) estimate. What you won't find, however, is any indication that this number is actually known to be wrong.

Pursuing the matter via internet resources is almost impossible - you'll get hundreds of "hits" on any reasonably formulated search and find the process of deciding which ones are credible enough to read, and then weighing what you do read, extremely time consuming - in part because you'll have to take time off to study planetary physics to understand much of the more apparently credible material.

Now it's obvious, of course, that most people won't care that the earth gains about 40,000 tons of stellar and inter-stellar matter every year - after all, how likely are you to want to precisely locate the Voyager 2 probe? - but the problem is as general as this example is specific. There's no good way for those who aren't domain experts to tell gold from dross, insight from incitement, or analysis from bullshit.

There have been many attempts to provide such a filter, but from Slashdot and Digg to Yahoo's human mediated indexing and Google's specialized search domains, they've all been overwhelmed by volume, by agendas, and by regression to the opinionated.

So where's this going? I think there are two trends, and they're going in opposite directions. On the one side we're seeing tightly controlled sub-nets like physnet that are peer reviewed and peer controlled but inaccessible to those lacking the professional credentials required to interpret what's said in them. On the other we're seeing massive reader recruitment into internet mobs like the one collected around the Daily Kos, characterized by us-against-them barriers and the replacement of thought and thoughtfulness with emotion and rhetoric.

What this leaves is an enormous opportunity in the middle: an opportunity to present search results to people like you and me, organized by publication date and credibility. Google is looking at organizing hits by timeline, but I don't know of anyone who's doing anything useful about search hit credibility.
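
To make "organized by publication date and credibility" a little more concrete, here's a minimal sketch in Python. Everything in it is an assumption for illustration: the SearchHit record, the organize function, the sample hits and their dates, and above all the credibility score itself, which is simply typed in by hand. Working out how to compute that number is the unsolved part, and nothing here pretends to solve it.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SearchHit:
    """One search result (hypothetical record, not any real search API)."""
    title: str
    published: date
    credibility: float  # assumed score in [0, 1]; computing it is the open problem


def organize(hits, min_credibility=0.0):
    """Drop hits below a credibility floor, then order the rest by
    credibility (highest first) and, within equal scores, newest first."""
    kept = [h for h in hits if h.credibility >= min_credibility]
    return sorted(kept, key=lambda h: (-h.credibility, -h.published.toordinal()))


# Made-up sample data: the titles, dates, and scores are illustrative only.
hits = [
    SearchHit("Daily political blog post", date(2007, 3, 1), 0.2),
    SearchHit("Peer reviewed journal article", date(1998, 6, 15), 0.9),
    SearchHit("University lecture notes", date(2005, 9, 1), 0.9),
]

ranked = organize(hits, min_credibility=0.5)
for hit in ranked:
    print(hit.published.isoformat(), hit.credibility, hit.title)
```

The sorting is the trivial part, which is rather the point: once a trustworthy credibility score exists, presenting hits this way takes a dozen lines. The market gap is in producing the score.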

And yet, it's a market gap - and what that means is that someone will successfully fill it.


Paul Murphy wrote and published The Unix Guide to Defenestration. Murphy is a 25-year veteran of the I.T. consulting industry, specializing in Unix and Unix-related management issues.