Pages: p. 88
Abstract—How effective are computers and software at filtering and sorting surveillance data and other databases for government intelligence? The role of computational science in training pattern-recognition software for big surveillance is considered.
Keywords—pattern recognition; face-recognition algorithms; big surveillance; scientific computing
At the height of its powers, just before the collapse of the communist regime that had nurtured it, East Germany's secret police, the Stasi, engaged 1 in 63 of the country's population in spying, either directly as employees or indirectly as informants.
Besides the departments called the Main Administration for Struggle Against Suspicious Persons, which was charged with the surveillance of foreigners, and Administration 12, which was responsible for monitoring mail and telephone communications, the Stasi also had a Division of Garbage Analysis.
As you can imagine, a surveillance effort on that scale generated a vast amount of data. One estimate put the total at around 1 billion pieces of paper. Although the Stasi was a formidable organ of foreign espionage and domestic repression, its effectiveness was limited by its inability to process all, or perhaps even most, of the data it collected.
Modern intelligence services benefit from the power of computing to store and process surveillance and other data. The first time I became aware of what might be called big surveillance was in the summer of 2007. On 30 June of that year, two men, Bilal Abdullah and Kafeel Ahmed, drove a Jeep Cherokee loaded with tanks of gasoline and propane to the main entrance of Glasgow International Airport in Scotland. The attack failed. Security bollards blocked the car from entering the terminal building. Ahmed was the only person who died (of burns). Only five people were injured; none seriously.
The hunt for possible co-conspirators got off to a quick and apparently successful start. The day after the attack, police used the UK's network of closed-circuit TV cameras and its automatic number plate recognition (ANPR) system to locate an acquaintance of Abdullah and Ahmed, Mohammed Asha, as he was driving on a stretch of the M6 motorway 250 miles south of Glasgow. It turned out that Asha wasn't a co-conspirator, although he had loaned money to Abdullah and Ahmed. Still, the ease with which his car was found was chillingly impressive.
The extent to which big surveillance helped authorities identify Dzhokhar and Tamerlan Tsarnaev, the alleged perpetrators of this year's Boston Marathon bombings, isn't clear. Having picked out two suspicious people in surveillance videos, the FBI presumably would have preferred to identify them by applying face-recognition algorithms to driver's license records and other government databases. Instead, the FBI sought the public's help, thereby warning the Tsarnaevs that they were suspects.
Regardless of the CPU power expended or the size of the databases trawled, finding Asha and the Tsarnaev brothers boiled down to matching a known object against a large collection of other objects. But what if you don't have something to match in the first place? Identifying suspicious activity then becomes a problem of correlating objects in a database and looking for anomalous patterns.
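To make the distinction concrete, here is a minimal Python sketch of the two kinds of problem. It is an illustration under my own assumptions, not a description of any agency's software: the feature vectors, record names, contact counts, and the functions `best_match` and `flag_anomalies` are all hypothetical. The first function matches a known object (a probe vector) against a collection by cosine similarity; the second has nothing to match and simply flags values that sit far from the rest.

```python
import math
from statistics import mean, stdev


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(probe, gallery):
    """Matching: return (name, score) of the gallery entry most similar to the probe."""
    scored = ((name, cosine_similarity(probe, vec)) for name, vec in gallery.items())
    return max(scored, key=lambda pair: pair[1])


def flag_anomalies(values, threshold=2.0):
    """No probe available: flag keys whose value lies more than
    `threshold` sample standard deviations from the mean."""
    mu = mean(values.values())
    sigma = stdev(values.values())
    if sigma == 0:
        return []  # every value is identical, so nothing stands out
    return [key for key, v in values.items() if abs(v - mu) / sigma > threshold]


# Hypothetical data, invented purely for illustration.
gallery = {
    "record_a": [0.9, 0.1, 0.3],
    "record_b": [0.2, 0.8, 0.5],
    "record_c": [0.4, 0.4, 0.9],
}
probe = [0.85, 0.15, 0.35]

weekly_contacts = {
    "person_01": 10, "person_02": 12, "person_03": 11, "person_04": 9,
    "person_05": 10, "person_06": 11, "person_07": 12, "person_08": 9,
    "person_09": 10, "person_10": 48,
}

name, score = best_match(probe, gallery)
print(name, round(score, 3))
print(flag_anomalies(weekly_contacts))
```

A z-score on a single quantity is about the crudest anomaly test there is. I chose it only because it fits in a few lines; a real system would presumably correlate many quantities at once.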
I can only guess at how such a pattern-recognition program would work. One approach could be to establish a pretend terrorist cell (staffed by agents) that conspires to set off a bomb. Whatever activities were captured by routine surveillance would be retrospectively analyzed for telltale patterns. If any real terrorists also happened to be under surveillance, their activities would also help train the software.
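In the same guessing spirit, the training step might look something like the following sketch. It assumes, hypothetically, that each person's surveilled activity has already been boiled down to a short list of numbers and that the staged cell's records are labeled as such. The features, labels, and data are invented, and the nearest-centroid classifier is my stand-in for whatever method a real program would use.

```python
import math


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def train(examples):
    """Learn one centroid per label from (features, label) pairs."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}


def classify(model, features):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: math.dist(features, model[label]))


# Hypothetical features per person: [late-night calls, shared-location
# visits, unusual purchases]. All values are invented for illustration.
training = [
    ([1, 0, 0], "ordinary"),
    ([0, 1, 0], "ordinary"),
    ([2, 1, 0], "ordinary"),
    ([6, 5, 2], "staged_cell"),
    ([7, 4, 3], "staged_cell"),
    ([5, 6, 2], "staged_cell"),
]

model = train(training)
print(classify(model, [6, 5, 3]))
print(classify(model, [1, 1, 0]))
```

With only a handful of staged examples, a classifier like this would learn the habits of the agents who played the cell as much as the habits of terrorists, which is one reason real cases under surveillance would be so valuable as training data.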
Would such a pattern-recognition program violate civil liberties? Proponents might argue that the initial detection doesn't rely on knowing anyone's identity; warrants could be applied for after suspicious activity had been flagged. On the other hand, a government would be wise to obtain the consent of voters for such a program, as the controversy around Edward Snowden's leaks continues to demonstrate.