America's Disappearing War Data

Crucial statistics from the Iraq and Afghanistan wars aren't being systematically stored, leaving lessons unlearnable.


The historical record of the Iraq and Afghanistan wars is being lost—and with it, the opportunity to learn from our mistakes and successes.

The Iraq and Afghanistan campaigns are unique in that they were the first wars to be documented electronically. The use of computers to track stabilization efforts produced enormous datasets in which important indicators were tracked, including daily electricity-production rates, georeferenced insurgent attacks, factory employment numbers, military spending on locally sourced goods and services, and public opinion. These data not only serve as the foundation of the historical record of both conflicts but also offer researchers opportunities to study insurgency and terrorism in ways not previously possible.


Unfortunately, the data are at risk.

Army Secretary John McHugh recently admitted to members of Congress that thousands of records from the Iraq and Afghanistan conflicts are missing. The Army’s admission that it has lost track of data resonates with our experiences, first as Defense Department officials implementing counterinsurgency programs in Iraq and now as researchers seeking to understand which programs succeeded in reducing violence levels and which did not. The problem is that much of the existing data were collected in an ad hoc manner that reflects the lack of planning for stability operations following both invasions. While certain data types were methodically maintained, others were kept by single individuals in more arbitrary ways, in some cases on a single computer’s hard drive, on a personal computer or within an e-mail account. As flash drives are lost, computers reformatted and files erased, and as human and magnetic memory degrades, various data types have been and will continue to be destroyed.

Yet the portion of these data that has so far been made available to scholars has enabled them to generate insights with important implications for current U.S. foreign policy. Equipped with advanced statistical methods, researchers from institutions including Stanford, Princeton and MIT who have managed to gain access to particular subsets of these data have begun to piece together the effects of various counterinsurgency programs. We now know, for instance, that military-funded projects in Iraq tailored to satisfy the needs of target communities were often effective in reducing insurgent violence, while there is virtually no evidence of such an effect from the billions of dollars spent on large infrastructure projects like bridges and electrical grids.

Other results derived using the data are likely to give defense strategists and military planners reason to pause before committing taxpayer dollars. A growing body of work indicates that, contrary to assumptions widely held throughout both conflicts, poor economic conditions do not appear to account for violence levels. Academics were surprised to discover, for instance, that areas in Iraq and Afghanistan with the highest levels of unemployment experienced some of the lowest levels of insurgent violence. Before joining the White House as President Obama’s chief economist, Princeton professor Alan Krueger used data from Iraq to show that terrorists typically do not come from low socioeconomic backgrounds.

These critical insights are only the first findings, and many opportunities for research remain. The effects of numerous large-scale programs implemented in both countries have yet to be tested, while more general questions about the conditions under which individuals opt to join insurgencies or to engage in terrorism persist. Unfortunately, without directed efforts by the Defense Department to locate and centralize these datasets and to develop efficient policies for giving researchers access to them, we may never know whether particular U.S. counterinsurgency programs helped coalition forces defeat insurgents or why some Iraqi and Afghan citizens turned to terrorism while others refrained from violence.

Current counterinsurgency doctrine rests in large part on experiences retold by individuals engaged in prior conflicts in places like Vietnam, Malaya and Algeria. As similar discussions are held with Iraq and Afghanistan combat veterans, Provincial Reconstruction Team advisors and others over the coming years, conversation is sure to devolve into the minutiae of how researchers might try to recover lost disk drives, read obsolete file formats, or track down particular contractors who had maintained specific data for the government but have not been heard from in years.

These are not passing concerns. Robert Gates remarked during his tenure as U.S. Defense Secretary that the “challenges we have seen emerge since the end of the Cold War—from Somalia to the Balkans, Iraq, Afghanistan and elsewhere—make clear we in defense need to change our priorities to be better able to deal with the prevalence of what is called ‘asymmetric warfare’… the mainstay of the contemporary battlefield for some time.” Archiving these data and making them available for research will do far more than help construct the historical record of the Iraq and Afghanistan conflicts. It will ensure that the lessons of the past decade are known to U.S. policymakers grappling with the implications of terminating billions of dollars’ worth of spending in Afghanistan, with how to support efforts to stabilize states like Syria, and with how to maintain peace in countries where it is most precarious, like South Sudan and Côte d'Ivoire.

Fortunately, the same technological advances that gave rise to this problem can help solve it. Many of these data could be collected and centralized at little cost to the government and made available to the community of scholars interested in learning from the U.S. experience in both countries. Where issues of classification restrict wholesale data release, technological procedures are available to expunge sensitive information without diminishing the data’s usefulness for research purposes. Individual observations in survey data, for instance, can be manipulated to make it nearly impossible to infer an individual’s identity from responses, while still preserving the aggregate patterns in the dataset as a whole.

Data aren’t silver bullets, and researchers need to evaluate carefully the inferences they draw from them. Data can contain errors, the effects researchers examine may have their source in something that has not been measured, and analysis can fall prey to a number of statistical errors. And some questions of interest simply do not lend themselves to even the most sophisticated statistical approaches.

But those are good problems to have—ones that become relevant only after data are loaded into researchers’ computers, and ones that the publication review process is designed to address. Without these data, however, we stand little chance of ever making sense of the effect of various U.S. wartime initiatives and their implications for U.S. policy over the foreseeable future.

Gerry Brown is the executive director of the Institute for Economic Stability and former director of corporate development for the Pentagon’s Task Force for Business and Stability Operations. Andrew Shaver is a doctoral candidate in Princeton University’s Woodrow Wilson School of Public and International Affairs and former staffer within the Office of the Secretary of Defense.

Image: Flickr/mazlov. CC BY 2.0.