By Michael Peck
With the 2020 U.S. presidential elections coming up, it is a certainty that Russia will use Internet trolls and fake news to influence the campaign. Indeed, there are reports that Russian trolls are already at work disparaging certain candidates.
But there are other reasons why the Pentagon should be concerned about fake news. False information, or doctored video or photos, can inflame foreign nations and put American soldiers at risk. Yet with a torrent of information and images, social media sites such as Facebook and Twitter face a herculean task in sifting false from true.
So, DARPA – the Pentagon’s cutting-edge research agency – wants to develop automated software that can spot trolls and fake news.
The Semantic Forensics (SemaFor) program “will develop technologies to automatically detect, attribute, and characterize falsified, multi-modal media assets (e.g., text, audio, image, video) to defend against large-scale, automated disinformation attacks,” according to the DARPA research announcement.
Statistical detection techniques that spot anomalies, or searches for digital fingerprints, are vulnerable to deception. Instead, DARPA wants to home in on semantic inconsistencies that indicate fakes, such as images that have been digitally distorted to change a person’s face.
“For example, GAN-generated faces may have semantic inconsistencies such as mismatched earrings,” DARPA explains. “These semantic failures provide an opportunity for defenders to gain an asymmetric advantage. A comprehensive suite of semantic inconsistency detectors would dramatically increase the burden on media falsifiers, requiring the creators of falsified media to get every semantic detail correct, while defenders only need to find one, or a very few, inconsistencies.”
The goal isn’t just to identify fake media, but also who did the faking. “Semantic detection algorithms will determine if multi-modal media assets have been generated or manipulated. Attribution algorithms will infer if multi-modal media originates from a particular organization or individual. Characterization algorithms will reason about whether multi-modal media was generated or manipulated for malicious purposes.”
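SemaFor’s actual design has not been published, but the three algorithm classes DARPA describes — detect, attribute, characterize — can be sketched as a pipeline. The following toy Python sketch is purely illustrative: the names (`MediaAsset`, `detect`, `attribute`, `characterize`) and the mismatched-earring check are assumptions built from DARPA’s own example, not real SemaFor code.

```python
# Illustrative sketch only: SemaFor's real algorithms are not public.
# The mismatched-earring heuristic mirrors DARPA's example of a
# semantic inconsistency in GAN-generated faces.
from dataclasses import dataclass, field

@dataclass
class MediaAsset:
    text: str = ""
    # Toy stand-in for visual analysis: attributes extracted per face
    # region, e.g. {"left_earring": "hoop", "right_earring": "stud"}
    face_attributes: dict = field(default_factory=dict)

def detect(asset: MediaAsset) -> bool:
    """Semantic-inconsistency check: flag a face whose paired
    features (here, earrings) do not match."""
    left = asset.face_attributes.get("left_earring")
    right = asset.face_attributes.get("right_earring")
    return left is not None and right is not None and left != right

def attribute(asset: MediaAsset) -> str:
    """Stub: a real system would compare stylistic and statistical
    fingerprints against known organizations or individuals."""
    return "unknown-origin"

def characterize(asset: MediaAsset, manipulated: bool) -> str:
    """Stub: a real system would reason about whether the
    manipulation served a malicious purpose."""
    return "possibly-malicious" if manipulated else "benign"

fake = MediaAsset(face_attributes={"left_earring": "hoop",
                                   "right_earring": "stud"})
manipulated = detect(fake)
print(manipulated, attribute(fake), characterize(fake, manipulated))
```

The point of the structure, per DARPA, is asymmetry: the forger must get every semantic detail right, while the defender’s detectors need only catch one slip.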
Given the huge amount of fake news, it is not surprising that SemaFor emphasizes volume. For example, researchers developing and testing the software are expected to collect 250,000 news articles and 250,000 social media posts during the initial stage. Then they will “falsify approximately 2,500 news articles and 2,500 social media posts in the first phase of the program.”
SemaFor’s accuracy will be compared to the results of human analysts trying to identify fake media. “Experiments will be designed to evaluate how well performer algorithms achieve the three main tasks (detect, attribute, and characterize), and will compare performance to human baselines. The purpose of the evaluations is twofold: first, to establish rigorous scientific protocols for measuring the performance of algorithms that reason about potentially falsified media and second, to assess the performance of a SemaFor system in realistic, operational environments.”
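The comparison DARPA describes reduces, at its simplest, to scoring an algorithm’s fake-or-real calls against ground truth alongside a human-analyst baseline. This minimal sketch uses invented labels and numbers purely for illustration; it is not DARPA’s evaluation protocol.

```python
# Hypothetical evaluation harness: score detection calls against
# ground truth, for both an algorithm and a human baseline.
# All labels below are invented for illustration.
def accuracy(predictions: list[int], truth: list[int]) -> float:
    """Fraction of items labeled correctly (1 = falsified, 0 = authentic)."""
    correct = sum(p == t for p, t in zip(predictions, truth))
    return correct / len(truth)

truth     = [1, 0, 1, 1, 0, 1, 0, 0]
algorithm = [1, 0, 1, 0, 0, 1, 0, 0]   # one miss
humans    = [1, 0, 0, 1, 0, 1, 1, 0]   # two misses

print(f"algorithm:      {accuracy(algorithm, truth):.2f}")  # 0.88
print(f"human baseline: {accuracy(humans, truth):.2f}")     # 0.75
```

A real evaluation would of course score each of the three tasks separately and under operational conditions, as the announcement specifies.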
Significantly, DARPA wants SemaFor to be open-source software rather than a proprietary system. This will enable users to tailor SemaFor as needed as threats inevitably evolve.
And those threats will evolve. Russia has embraced a “hybrid war” strategy, operating in that gray zone between peace and war where an adversary can be paralyzed by a mixture of propaganda, fake news and small-scale military operations. Disinformation is war on the cheap for Russia, which has entered a new Cold War with an America that has a far larger economy and military, yet seems vulnerable to foreign political manipulation.
China and other U.S. adversaries will take note. But DARPA hopes that American technical know-how will prove equal to the task of detecting fake news.