The Christian Science Monitor explores the issue of whether something produced entirely by aggregation should become secret and whether the Washington Post, by bringing together disparate pieces in a story detailing the top secret world of America, may have produced in its survey a review that is more than the sum of the parts. The Monitor writes:
The Post’s two-year investigation into the nation’s massive post-9/11 security buildup was constructed almost entirely from public records, according to the paper. But in a larger sense the project may have produced an overall picture that the US government would consider classified, had it produced such a report itself.
In recent years the US has consistently pushed a “mosaic theory” of intelligence gathering. This holds that individually harmless pieces of information, when combined with other pieces, can produce a composite picture that reveals national security vulnerabilities.
“Under the mosaic theory, even if the individual pieces are part of the public domain, a particular aggregation of data, or method by which the data was compiled, could in fact be classified,” says Stephen Vladeck, a professor and expert in national security law at American University’s Washington College of Law.
It is certainly true that any intelligent aggregation and synthesis of disparate data is valuable. The process is often referred to as extracting information out of data. Combining the pixels to form a picture. Connecting the dots. It is a very familiar process. Dots by themselves have little intrinsic value. It is how they are connected that gives them most of their value. The Defense Security Service recently sent out a bulletin to the dots alerting them to the Washington Post article.
Early next week, we expect the Washington Post to publish articles and an interactive website that will likely identify government agencies and contractors allegedly conducting Top Secret work. The website is expected to enable users to see the relationships between the federal government and its contractors, describe the type of work the contractors perform, and may identify many government and contractor facility locations. …
We recognize that this information can be put to legitimate use. However, without a doubt, foreign intelligence services, terrorist organizations, and criminal elements will also have interest in this kind of information. It is important that companies continually review their overall security posture to ensure that it meets required standards. We recommend that companies affected by this publication and website assess, and take steps to mitigate, risk to their workforce, facility and operations. These steps should include reinforcement of security and counterintelligence (CI) protections, and a dedicated effort to enhance workforce awareness of threats.
But does heightened awareness among the dots offset the disadvantage of being placed on a map? Questions over the dangers posed by the sheer availability of data have been raised in connection with Google’s tireless mapping of darned near everything. Google has been snapping up information on the apparent premise that its sheer acquisition sets off a process analogous to creating a critical mass of fissile material. Put enough ordinary stuff together and suddenly the particles they share among each other create a chain reaction that goes boom. The potential of aggregating and indexing information may be as unknown as the effects of the first atomic bomb were. Before the Trinity Test was conducted “a betting pool was also started by scientists at Los Alamos on the possible yield…” because no one really knew what it would do.
Yields from 45,000 tons of TNT to zero were selected by the various bettors. The Nobel Prize-winning (1938) physicist Enrico Fermi was willing to bet anyone that the test would wipe out all life on Earth, with special odds on the mere destruction of the entire State of New Mexico!
Fortunately it did not blow up the world. But while we know the limits of critical mass in uranium, we may not yet know them for information. Today the familiar little Google Street View cars tootle round the world, equipped, as we now know, not only with panoramic cameras but also with an unseen ‘camera’ for recording network information from home networks as they bowl along. While everything they collect may be in the public domain (the view from the street), there are concerns that enough ordinary information adds up in the end to something extraordinary. Already the case can be made that the Internet itself, along with its associated search engines and databases, is a far greater espionage tool than the Washington Post article, which is simply a subclass of the parent.
Journalists have already remarked that the Washington Post “story” is particularly groundbreaking (for a newspaper) because it takes the form of a visualization tied to a database. The Washington Post says, in a kind of introspection, that “as others have begun to note, the Post’s editors broke with convention by publishing the series on Monday, instead of in the Sunday paper, to reach a broader national audience who read The Post online. Even more, it’s the form of Dana Priest and William M. Arkin’s Post investigation that shows the potential for something new: It takes years of research and turns it into digestible pieces for the click-happy dilettante readers of the Internet.” Flowing Data spelled it out: “of main interest: a network diagram shows organizations and their top secret activities and a map shows the geographic distribution of government organizations and companies within Top Secret America.”
Click on a specific organization for within group breakdowns. At this point it gets a little confusing with drill-down pie charts, especially if you’re just browsing, and a spiral view is also offered which feels extraneous. The overall story and heavy research, however, makes it worth clicking through the clunky at times set of interactives.
All this was supposed to have been done for the public good. The Washington Post makes a not entirely convincing case that the American Top Secret system has grown so big that nobody can find the secrets. By laying out the map of the sprawling and Byzantine empire the newspaper can claim the public policy goal of highlighting a real intelligence weakness. The WaPo argues that there is so much secrecy that no real secrets can be found in this mountain of whispers:
Two “super users” in the department told the Post that it’s impossible for them to keep track of the mountains of top-secret info they’re exposed to. “I’m not going to live long enough to be briefed on everything,” one said. … Agencies are collecting so much data that they don’t have enough translators or researchers to analyze it …. Turf wars among agencies can prevent the sharing of information.
So in a way the Washington Post agrees with the “mosaic theory”, since it uses it to prove its point. Here’s the mosaic, and it proves the intelligence system is too complicated. But it cannot now turn around and say that its mosaic, so valuable in producing its political conclusion, has no value to the enemy. After all, any blade will cut both cheese and chalk. Ironically, the best defense of the security establishment against the Washington Post exposé may be to keep growing, because the best way to invalidate the cache is to change the data. Once the 854,000 who the WaPo says have top secret clearances have doubled in number, the SVR will have only half the picture.
The key resource in a world which relies on aggregating data to produce information isn’t the dots but control over the aggregator. Using the Internet itself as a model of an information structure, the battle between the spider and the web is a never-ending one. It appears that information is growing faster than the spiders can scuttle around it. In 2007 the searchable Internet was believed to be more than 15 times larger than all the information contained in the Library of Congress. But it was a mere fraction of the actual information “out there”. Some estimate that up to 99.98% of the information potentially available is, for various reasons, beyond the reach of the spiders. This is the deep web, sometimes loosely called the Darkweb, which is beyond even the possibility of aggregation.
Maybe the greatest comfort that national security agencies can derive from the story is that Dana Priest and William M. Arkin wrote it. That makes it self-limiting. These two will never be able to keep the intelligence “story” cache current. The real threat would have been if the Washington Post had turned its effort into a kind of open source project or adopted some form of information collaboration which would have multiplied the spiders. When journalism does that, all the SVR will have to do from here on is surf the web, unless we create a web in which the servers somehow know what and what not to display depending on who’s asking. The biggest intelligence challenge of the future may not be compartmentalizing information (keeping the dots as dots) so much as making the dots aware of who they are talking to. If mosaics are necessary to intelligence, then some way must be created for the pieces to become conscious of what they are becoming part of. Collaboration means nothing unless it is intelligent collaboration. Every party that comes to an information dance should ideally know what it is going to get and convey what it will potentially bring. How one can do this in an online world, where we increasingly rely on online information and reputation to determine what a thing or person is and what he is doing, will be one of the grander challenges of the near future.
Looking for the light of a new love
To brighten up the night, I have you love
And we can face the music together
Dancing in the dark, dancing in the dark
Dancing in the dark