North Korea continues to rattle the nuclear saber. Just how potent is the DPRK’s nuclear arsenal? Can North Korea hit the United States with a nuclear weapon? A credible arsenal would require proper testing. With these questions in mind, we present the latest from our friends at 38 North, where this piece first appeared, who ask: Did North Korea test a nuclear device in 2010?
In 2010, two radionuclide stations in Northeast Asia detected radioactive particles that seemed to indicate that a nuclear explosion had taken place. While there are other possible explanations, other evidence seemed to suggest that North Korea had conducted a very small and otherwise undetected nuclear test. In the past few years, there have been a number of studies of radionuclide data, seismic data and now, on 38 North, satellite imagery.
While some of the evidence is intriguing, I don’t buy it. My objections are largely methodological, and methodological objections are important to me. Everyone who does analysis will be wrong from time to time. I try to be methodologically cautious so that, when I inevitably get it wrong, I will still feel like I made the right judgment based on the evidence available to me.
I think the hypothesis that North Korea conducted a nuclear test in May 2010 is a reasonable one worth considering. North Korea has conducted three nuclear weapons tests, presumably reducing the size and mass of the nuclear device, fixing whatever went wrong in 2006 and possibly confirming a design using uranium. It is possible that, along the way, North Korea conducted a low-yield science experiment or simply tested a dud.
Frankly, I’d love to be the person who proves that North Korea conducted a secret nuclear test. But, based on the evidence we have, I just don’t think it is more likely than not.
The 1979 Flash in the South Atlantic
First, a little history. In many ways, the debate over whether North Korea conducted a May 2010 test reminds me of a similarly ambiguous event in 1979.
The 1979 “flash in the South Atlantic” was precisely that—an optical detector called a “bhangmeter” on a US satellite detected a flash of light that looked a bit like a nuclear test somewhere in the South Atlantic. There was a lot of circumstantial evidence that pointed to Israel as the culprit. And I don’t mean ‘circumstantial’ in an insulting way. I mean that the prospect of a covert Israeli test seemed then, and still does today, totally plausible based on all kinds of evidence.
After the flash, the scientific community started scrutinizing every pool of sensor data to find the slightest corroboration. A few interesting things turned up, but nothing conclusive. There was some hydrophone data, but it required a sound wave to bounce off of Antarctica. There were claims of radioactive sheep thyroids that I’ve never been able to confirm. And so on. A full review of the evidence is beyond the scope of this little essay, but the approach raised a methodological concern. Spurious correlations are a statistical fact. A 90 percent confidence level means you’re still getting fooled 10 percent of the time. So, if you look hard enough for corroboration, you will find a few things, even if they are spurious. As a scientific panel charged with reviewing the data concluded:
We surmise that had a search been made for corroborating data relevant to a nonexistent event chosen to occur at a random time, such a search would have provided ‘corroborative data’ of similar quality and quantity to that which has been found during analysis of the September 22 signal.
To put it simply, one must be careful to avoid collecting coincidences that support a hypothesis while ignoring data that undermines a hypothesis.
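The arithmetic behind that warning can be sketched in a few lines. This is a minimal illustration, assuming every data source searched is an independent check with the same 10 percent false-alarm rate; that is an idealization of mine, not a calculation the panel performed, and the function names are hypothetical.

```python
# Minimal sketch: how often does a search for corroboration turn up
# something spurious? Assumes independent checks that all share one
# false-alarm rate (an idealization, not the panel's own analysis).

def chance_of_spurious_hit(n_checks, false_alarm_rate=0.10):
    """Probability that at least one of n_checks fires by chance alone."""
    return 1.0 - (1.0 - false_alarm_rate) ** n_checks


def expected_spurious_hits(n_checks, false_alarm_rate=0.10):
    """Average number of chance 'corroborations' among n_checks."""
    return n_checks * false_alarm_rate


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        p = chance_of_spurious_hit(n)
        k = expected_spurious_hits(n)
        print(f"{n:2d} checks: P(at least one spurious hit) = {p:.2f}, "
              f"expected spurious hits = {k:.1f}")
```

Under these assumptions, a single check misleads one time in ten, but a search across twenty data sets would be expected to produce about two chance hits for an event that never happened.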
Ultimately, the scientific panel decided to reject the hypothesis that the bhangmeter had seen a nuclear test for a simple, elegant reason: the satellite’s bhangmeter, like a pair of eyes, consisted of two sensors, and the two sensors saw different events. If something is far away, such as on the surface of the earth, the two sensors are close enough together that they should see the same thing. The fact that the two sensors saw something different, the panel reasoned, suggested the flash occurred in space very near the satellite and not on the ground. This was an elegant answer. It also persuaded no one. People simply accused the scientific panel of covering up for the Carter administration, Israel, etc.
I feel precisely the same way about the alleged May 2010 nuclear tests. As in the case of Israel in 1979, I have no trouble accepting that the DPRK might have conducted a nuclear test in May 2010. But, as in the case of 1979, the assembled evidence seems to be merely a collection of coincidences that we could collect for a nonexistent event on a randomly chosen day.
At the core of this problem is a reversal in how we think about detecting underground nuclear tests. The traditional thinking is that the correct way to “detect” an underground nuclear test is to spot it seismically. If radionuclides later appear, that helps “characterize” the seismic event as a nuclear explosion, rather than a conventional one. Generally, policymakers have been reluctant to rely on radionuclide readings alone to “detect” events for reasons that should become clear. The radionuclide community, however, is very excited about getting the same recognition as seismologists, especially now that computer simulations promise reliable methods to model the transport of radionuclides based on weather data. So, there may be a bit of a disciplinary food fight here.
In May 2010, the DPRK released a series of statements that a “thermonuclear” reaction had occurred in April. In the months following the announcements, a well-respected Swedish radiochemist, Lars-Erik De Geer, correlated these statements with certain radionuclide readings collected by the Comprehensive Nuclear-Test-Ban Treaty Organization’s (CTBTO) International Monitoring System (IMS). The data include xenon isotope ratio measurements at a national radionuclide monitoring site near Geojin (South Korea) and an IMS site near Takasaki (Japan), and barium/lanthanum measurements at CTBTO IMS sites near Ussuriysk in Russia and Okinawa in Japan. (Only lanthanum (La) was detected at Ussuriysk.) All these measurements occurred between May 13 and 18, 2010.
De Geer published his findings in a 2012 article in Science and Global Security. I was skeptical of the original De Geer paper because it relied on an extraordinarily artificial scenario to explain the observed radionuclide readings: two undetected nuclear tests, conducted in the same chamber approximately one month apart.
A number of radiochemists reviewed and agreed with De Geer’s initial paper. One concluded that the evidence suggested a nuclear explosion, although he argued that the radionuclide evidence was best explained by a single explosion and dismissed the xenon detections at Takasaki in Japan as coincidental.
De Geer himself concluded that the initial paper was in error, publishing a second paper in the Journal of Radioanalytical and Nuclear Chemistry. While De Geer’s first paper posited two undetected tests, the second paper posits only a single test on May 11.
Now, there are two ways to respond to this revision. I took it as confirmation of my original complaint that the scenario was being fitted to the data, which is a serious methodological warning sign. My colleagues, quite reasonably, said, “Yeah, but the new scenario is pretty clean. What’s your objection to it now?”
Then along came another group of radiochemists, Ihantola et al, who agreed that a nuclear explosion occurred, but estimated the likely time of the event to be much later than De Geer’s estimate. De Geer and Ihantola et al posit very different explosion times, each outside of the error range posited by the other. Only the confidence intervals overlap, and just for a few hours.
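The kind of disagreement described here, where error ranges miss each other and only the wider confidence intervals touch, can be sketched as a simple interval comparison. This is a minimal illustration; every number below is a placeholder I made up to show the pattern, not a value from De Geer or Ihantola et al, and the class and function names are hypothetical.

```python
# Minimal sketch of comparing two explosion-time estimates.
# ALL numbers are hypothetical placeholders, not published values.
from dataclasses import dataclass


@dataclass
class TimeEstimate:
    best: float        # best-estimate time, in hours from an arbitrary origin
    error: float       # half-width of the quoted error range, in hours
    confidence: float  # half-width of the wider confidence interval, in hours


def overlap_hours(lo_a, hi_a, lo_b, hi_b):
    """Length of the overlap between [lo_a, hi_a] and [lo_b, hi_b], or 0."""
    return max(0.0, min(hi_a, hi_b) - max(lo_a, lo_b))


def compare(a, b):
    """Return (error-range overlap, confidence-interval overlap) in hours."""
    error_overlap = overlap_hours(a.best - a.error, a.best + a.error,
                                  b.best - b.error, b.best + b.error)
    confidence_overlap = overlap_hours(
        a.best - a.confidence, a.best + a.confidence,
        b.best - b.confidence, b.best + b.confidence)
    return error_overlap, confidence_overlap


# Placeholder estimates chosen only to reproduce the qualitative pattern.
earlier = TimeEstimate(best=0.0, error=6.0, confidence=18.0)
later = TimeEstimate(best=30.0, error=6.0, confidence=15.0)

error_overlap, confidence_overlap = compare(earlier, later)
print(f"error ranges overlap by {error_overlap:.1f} hours")
print(f"confidence intervals overlap by {confidence_overlap:.1f} hours")
```

With these placeholder inputs the error ranges do not overlap at all, while the confidence intervals share a three-hour window. Two analyses that agree only in that sliver are not really corroborating each other.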
So, here we are. Is De Geer right? Are Ihantola et al right? Or do we just shake our heads, muttering about how the data, like Jay from Serial, seems to always tell us what we want to hear? I don’t blame Wright for concluding that some of the readings might be unrelated to a test, but once we start tossing out awkward data, our thin methodological ice starts to crack.
Moreover, false alarms are possible. Nuclear power stations, reprocessing plants and other human activities can result in releases that look like nuclear explosions. Early operation of a radionuclide monitoring system in Germany detected xenon spikes from nearby reactors. (The false alarm has led to better methods of characterization that emphasize isotopic ratios, but these methods still struggle to distinguish an explosion from a fresh load of fuel.) In another instance, in 2004, a radionuclide station detected lanthanum-140 that was later determined to have come from a military decontamination exercise. We worry a great deal about false negatives, meaning a missed nuclear test, but we never seem to worry about false positives.