Belmont Club

Database of databases

Although “database journalism” is said to have started in 1952, when CBS used a UNIVAC to help analyze election results, the widespread use of data queries and mining to generate stories only started relatively recently. In 1988 “Bill Dedman of The Atlanta Journal-Constitution … received the Pulitzer Prize … for his … investigation, The Color of Money, which dealt with mortgage lending discrimination and redlining in middle-income black neighborhoods.” By 2009, the term “computational journalism” had come into existence. Data mining and search algorithms were proclaimed the future of investigative journalism.

The rise in computational journalism paralleled the decline in access reporting, in which a journalist’s value was directly proportional to the number of people who would give him “access”. As James Hamilton at the DeWitt Wallace Center for Media and Democracy at Duke University was realizing that intelligence agencies were detecting terrorists by analyzing information patterns and signals, traditional newspapers were laying off the Woodward and Bernstein wannabes by the thousands. “Deep Throat” was losing out to “Data Mining”.

But as those who tried to connect the list of Dead People to the voter lists in each state can attest, data is far from being either easily accessible or readily relatable without a major investment in time and money. This has naturally given rise to efforts to turn the identification of projects into an open-source effort.
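To give a flavor of why two lists are not "readily relatable", here is a minimal sketch of the kind of matching involved. It is not anyone's actual method: the field names (`name`, `birth_date`), the sample records and the helper functions are all assumptions made up for illustration, and real record linkage needs far more care (nicknames, typos, people who share a name and a birthday).

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Lowercase, strip accents and punctuation, and sort the name parts.

    Sorting the parts makes "Smith, John" and "John SMITH" compare equal.
    """
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in ascii_name.lower()
    )
    return " ".join(sorted(cleaned.split()))


def match_records(deceased, voters):
    """Pair deceased and voter records that share a normalized name and birth date.

    Both arguments are lists of dicts with hypothetical "name" and
    "birth_date" keys. A match is only a lead to check, not proof.
    """
    index = {}
    for voter in voters:
        key = (normalize_name(voter["name"]), voter["birth_date"])
        index.setdefault(key, []).append(voter)

    matches = []
    for person in deceased:
        key = (normalize_name(person["name"]), person["birth_date"])
        for voter in index.get(key, []):
            matches.append((person, voter))
    return matches


# Made-up sample data, purely for illustration.
deceased_list = [{"name": "Smith, John", "birth_date": "1930-01-02"}]
voter_list = [
    {"name": "John SMITH", "birth_date": "1930-01-02"},
    {"name": "John Smith", "birth_date": "1955-05-05"},
]
candidate_matches = match_records(deceased_list, voter_list)
```

Even this toy version shows where the time and money go: every source formats names and dates differently, and each candidate match still has to be checked by a person.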

If you don’t believe me, take a look at the Web site for the Sunlight Foundation, which has founded or funded a dizzying array of projects aimed at revealing “the interplay of money, lobbying, influence and government in Washington in ways never before possible.”

Because it supports so many fascinating ways of tracking the performance of politicians and government agencies, I can only hint at how inventive Sunlight’s agenda actually is. But here’s one entertaining hint: The foundation’s Party Time site tells you where Washington’s politicos are partying (and, very often, raking in campaign contributions). I clicked in a few weeks ago to see that the Build America PAC, a committee that seems to give primarily to Democratic causes and candidates, was cordially inviting people to join Congressman Gregory Meeks, D-N.Y., at the fourth annual Las Vegas weekend retreat at Bellagio early in December. For a suggested contribution of $5,000 for a co-host and $2,500 for a sponsor, a retreat-goer could get a private reception with Congressman Meeks and a “Special Guest,” not to mention a discounted room rate at Bellagio. I know these particulars because Party Time includes PDFs of the invitations to political partying.

It wasn’t long before a bunch of us tried our hands at creating similar functionality. The Database of Databases project is now up and ready to roll, in a pre-Alpha state. It has links to real government directories and datasources that can be queried by anybody and is therefore actually useful right out of the box. More important, anybody can sign up and begin creating links to datasources of his own. Over the next couple of weeks, the Database of Databases project will tally the links each contributor has added, and contributors will be rewarded by having their chosen screen names listed in a Top Salesman of the Week style fashion.
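The tally itself is simple enough to sketch. What follows is only an illustration of the idea, not the project's actual code: the `screen_name` field, the sample submissions and the `weekly_leaderboard` function are hypothetical.

```python
from collections import Counter


def weekly_leaderboard(submissions, top_n=10):
    """Rank contributors by the number of datasource links they submitted.

    `submissions` is a list of dicts with a hypothetical "screen_name" key.
    Ties are broken alphabetically so the ordering is stable.
    """
    counts = Counter(entry["screen_name"] for entry in submissions)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Made-up submissions, purely for illustration.
sample_submissions = [
    {"screen_name": "alice", "url": "https://example.org/a"},
    {"screen_name": "bob", "url": "https://example.org/b"},
    {"screen_name": "alice", "url": "https://example.org/c"},
]
leaderboard = weekly_leaderboard(sample_submissions)
```

With the sample data above, `alice` lands ahead of `bob` with two links to one.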

We want to add “Quests” to the mix. Quests will enable the user community to set goals for what amounts to large online Easter Egg hunts. Naturally those who embark on these quests will be suitably recognized by the community.

This is altogether too lighthearted for serious journalists, who see technology as one way to rescue a trade whose prospects are in decline. John Mecklin quotes “Irfan Essa, a professor in the School of Interactive Computing of the College of Computing at the Georgia Institute of Technology … who … is often credited with coining the [term computational journalism].”

He says both journalism and information technology are concerned, as disciplines, with information quality and reliability, and he views the new field as a way to bring technologists and journalists together so they can create new computing tools that further the traditional aims of journalism. In the end, such collaboration may even wind up spawning a new participant in the public conversation. “We’re talking about a new breed of people,” Essa says, “who are midway between technologists and journalists.”

One possibility which Essa might consider is that information technology may destroy journalism as a profession in its current form. Once the tools for computational journalism are sufficiently developed, then anyone in principle may operate them. Nothing inherently restricts them to ‘journalists’. Most importantly, if access is open, then anyone in principle can decide what an ‘important story’ should be. It is this aspect of the revolution which may have the biggest consequences. When anyone can start a narrative, the Narrative disintegrates in favor of memes which spring into existence more spontaneously.

But enough of that. Anyone who wants to give the Database of Databases a try is free to do so at — and feel free to participate in the discussions and report bugs.
