Recent days have brought us two posts eyeballing the GDELT data. One, at Political Violence @ a Glance, is problematic (see my two-part response here and here); the other, by Alex Hanna (@alexhanna) at Bad Hessian, is useful, but nevertheless suffers from a considerable weakness that is widely shared among academics interested in the study of contentious politics, sub-national analysis of violent political conflict, dissent–repression, and events data more generally. As I have much to say, this is part 1 of two; I will post the second tomorrow. Here I explain the unrecognized but fanciful assumption that underpins virtually every study of the type Hanna reports. Recognizing, and then abandoning, that assumption unfortunately turns out to be a bit scary. But there are two solutions, the first of which I discuss below. In Part 2 I will offer another solution to the scariness that is especially appealing. So stay tuned.
Fountains of Youth and Pots o’ Gold
One thing that most scholars who become interested in events data seem to assume implicitly is that it is possible to observe (count) the events of interest. Put differently, they appear to be unaware that for the vast majority of events in which researchers have an interest, a census is not possible. The values that these concepts take are literally unobservable. It follows that searching for ways to count the values of these variables is akin to searching for a pot o’ gold at the end of a rainbow or the Fountain of Youth.
To appreciate why this is so, consider the following thought experiment. Imagine that you wanted to describe the central tendency and dispersion (e.g., mean and variance) of civil disobedience (aka non-violent protest activity) across countries, at an annual level of temporal aggregation, during the period from 1950-2010. Researchers who study such phenomena rely on media accounts and/or government reports and/or testimony collected by (international) non-governmental organizations, and it should be apparent that these records are incomplete. Indeed, how might one generate a census of civil disobedience?
Here is a fanciful proposal: one could post a network of human research assistants throughout the territory of interest such that each assistant was within sight of at least one other assistant, with enough assistants to cover not only the space involved but three eight-hour shifts. Provided one is willing to assume the existence of a valid and reliable set of coding rules, and further that the assistants would be diligent, the research team could be confident that it was collecting a census of the civil disobedience activity that occurred during the time and space it observed.
I hope that this rather fantastic thought experiment provides the reader sufficient stimulation to suggest that all contentious politics data collected by social scientists are incomplete. To generalize beyond contentious politics: when we collect data about the behavior of human beings, we ought to consider carefully whether (1) the people taking the action usually have an incentive to either hide or exaggerate their behavior, and/or (2) other people have an incentive to make it difficult for those not present to learn of the behavior.
Virtually everything written about the creation of events data fails to engage this issue, and the literature suffers for it. That is, the authors who have done this work appear to implicitly assume that the goal is to produce a census, or as close to one as they can possibly get.
Imagine, for a moment, that economists had taken such an approach to measuring the concept Gross National Product (GNP). This incredibly widely known concept suffers from the same problem: it is not possible to construct an actual count of GNP over any interesting spatial–temporal domain. As such, economists are left to estimate GNP. Unlike those of us who work with contentious politics events data, however, to the best of my knowledge all of the economists who work on estimates of GNP recognize that they are working on estimates: they are explicitly aware that there is a non-trivial measurement problem that cannot be boiled down to two (multi-dimensional) issues, sources and coding protocols. Unfortunately the literature on events data has yet to make that realization, and we remain mired in flawed discussions of the (important!) source and coding protocol issues. Those issues warrant investigation and debate, but they cannot be discussed well until we fix the underlying problem.
What’s the Implication for Studies that Compare Sources?
That said, there is a considerable literature out there that does what Hanna did in his post: compare two different data sets that code some contentious politics concept. And we have learned much of value from it! See, for example, Snyder & Kelly (1977), Franzosi (1987), Martin (1988), Olzak (1989), McCarthy, McPhail & Smith (1996), Oliver & Myers (1999), Sommer & Scarritt (1999), Oliver & Maney (2000), Maney & Oliver (2001), Poe, Carey & Vazquez (2001), Davenport & Ball (2002), Koopmans & Rucht (2002), Almeida & Lichbach (2003), and Earl, Martin, McCarthy & Soule (2004), among many others. Summarizing what we have learned is a bit daunting, and at minimum justifies a distinct post. So you are stuck with my assertion (or lots of reading).
Unfortunately, as I have noted, aside from Davenport (2010) virtually none of this literature recognizes that a census is not possible; that they are producing an estimate of an unobservable, or latent, concept. Hanna’s study was done with Pam Oliver and Chaeyoon Lim, the former of whom is a co-author of two of the above studies. And while there are many specific things we have learned from them, this type of study is fundamentally flawed because it implicitly assumes that there is an observable, non-latent, count of event Z that we can code. That simply isn’t so. As such, the correlation, or partial correlation, or whatever measure of association, between the count of event Z in event dataset A versus that in event dataset B is of rather limited interest and value. Indeed, barring the same coding scheme and sources, going into such an analysis with the expectation that the two would be strongly associated only makes sense if one invokes the implicit assumption that there is an observable count we could code. Once we abandon this fiction the raison d’être of such studies is severely diminished. The reason is that we already know, from the above, a fair amount that will prove useful should we want to model the dual processes that produce the reports of such events, a topic to which I now turn.
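To see the point concretely, consider a quick simulation with entirely made-up numbers: two datasets that each capture a different biased fraction of the same latent counts will be strongly associated with one another, yet that association tells us nothing about how badly both undercount the truth.

```python
# Illustrative simulation (hypothetical parameters throughout): two event
# datasets that each record a biased fraction of the same latent counts can
# be strongly associated while both badly undercounting the truth.
import random

random.seed(0)

true_counts = [random.randint(20, 200) for _ in range(200)]  # latent events
# Dataset A captures roughly 30% of events, dataset B roughly 20%,
# each with a little additional noise.
data_a = [int(t * 0.30) + random.randint(-2, 2) for t in true_counts]
data_b = [int(t * 0.20) + random.randint(-2, 2) for t in true_counts]

def corr(x, y):
    """Pearson correlation, stdlib-only."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

print(round(corr(data_a, data_b), 2))              # strongly associated...
print(sum(data_a), sum(data_b), sum(true_counts))  # ...yet both undercount
```

A high correlation here validates nothing: it merely reflects the fact that both series are thinned versions of the same latent process.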
OK, so what now?
There are a variety of ways forward from here. For example, if one is interested in a count of event Z that is independent of any particular theorizing or hypothesis testing, then we must recognize that we need to estimate a latent, unobservable variable. Full stop. Do not pass “Go,” do not collect $200. No defensible alternative exists.
How might that be done? We must embrace the latent variable challenge we are stuck with, and measurement models are an excellent option by which we can do so. There are lots of ways to develop measurement models, and I will describe two I know of. In tomorrow’s post I will discuss a theory driven (theoretically laden?) alternative to these theoretically bereft ones.
Multiple systems estimation is the first and, for the present, the leading alternative. The Human Rights Data Analysis Group (HRDAG) leads the field. In addition, Cullen Hendrix and Idean Salehyan presented a paper at the 2013 Peace Science Society meeting that takes a multiple systems approach to develop an estimate of the events they code in their Social Conflict in Africa Database (SCAD) project.
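To give a flavor of the idea, here is a minimal sketch of the simplest two-list case, the Lincoln–Petersen (Chapman) estimator. All of the counts are hypothetical, and real multiple systems work (such as HRDAG’s) uses more than two lists and models dependence among them.

```python
# Two-list multiple systems estimation: the Chapman bias-corrected form of
# the Lincoln-Petersen estimator. Counts below are hypothetical.

def chapman_estimate(n_a: int, n_b: int, n_both: int) -> float:
    """Estimate the total number of events given the count on list A,
    the count on list B, and the number appearing on both lists."""
    return (n_a + 1) * (n_b + 1) / (n_both + 1) - 1

# Hypothetical example: source A records 120 events, source B records 90,
# and 40 events appear on both lists.
print(chapman_estimate(120, 90, 40))  # about 267.6: well above either list
```

The intuition is the capture-recapture logic from ecology: the overlap between independent lists tells us how much each list misses, and hence how many events neither list captured.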
The second approach is to create theory-driven measurement models, which have the downside that they suffer from specification error and, more precisely, that their estimates can be unstable across different specifications. Nevertheless, what might such a model look like? The starting point is recognition (the assumption?) that two distinct processes produce the events that show up in (news, INGO, government, etc.) reports. First is the process that interests us: the processes that lead human beings to gather in groups and challenge states (and other groups), and that govern states’ (anticipatory and reactionary) responses to such challenges. Second is the process by which a biased subset of such activities (by both dissidents and states) finds its way into the various natural language sources we might subject to content analysis.
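As a toy illustration of that dual-process story, here is a simulation under made-up parameters: a latent process generates the true annual event count, and a reporting process reveals only a subset of those events to each source. Every number here (the rate, the report probabilities) is a hypothetical assumption for illustration only.

```python
# Toy simulation of the dual-process view: a latent conflict process
# generates true event counts, and a biased reporting process reveals only
# a subset to each source. All parameters are hypothetical.
import math
import random

random.seed(1)

TRUE_RATE = 50.0      # assumed mean number of true events per year
P_REPORT_NEWS = 0.4   # assumed chance a newswire records a given event
P_REPORT_NGO = 0.25   # assumed chance an INGO report records it

def poisson(lam: float) -> int:
    """Draw from a Poisson distribution (Knuth's algorithm, stdlib-only)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while p > threshold:
        k += 1
        p *= random.random()
    return k - 1

records = []
for year in range(1990, 2010):
    true_events = poisson(TRUE_RATE)               # process 1: conflict
    news = sum(random.random() < P_REPORT_NEWS for _ in range(true_events))
    ngo = sum(random.random() < P_REPORT_NGO for _ in range(true_events))
    records.append((year, true_events, news, ngo))  # process 2: reporting

# Every observed series is an undercount of the latent truth.
for year, true_events, news, ngo in records[:3]:
    print(year, true_events, news, ngo)
```

A measurement model, on this view, is an attempt to run that simulation in reverse: to use the observed counts, plus assumptions about the reporting process, to recover the latent series.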
A reader is well within reason to express concern that this is complicated. You bet. Now get over it. Pretending that we can code the ground truth about event type Z is dead on arrival. It is time for us to accept that and take on the challenge of doing the difficult theoretical and statistical modeling needed. And the sooner we get to it, collectively(!), the better off we are.
Keep in mind that the preceding section is a discussion only about creating an estimate of the true count. There is an easier, more theoretically interesting, approach available to those of us who wish to use events data to test hypotheses drawn from our theories. Since I am almost always in that world in my own work, I favor this approach, which is an explicitly theoretically driven solution. That is, when we are testing the hypotheses implied by our theories we are not interested in a count of event Z independent of any theoretical perspective. Our conceptualization of event Z is part of our theory, which is to say it is not theory independent. More specifically, the vast majority of us who use events data to study contentious politics are interested in the choices that actors make in conflict. Tomorrow I will describe a straightforward, widely applicable set of assumptions that anyone working with such theory can make. Doing so eliminates the latent/unobservable problem described above. It is really pretty cool. And more of us need to start adopting it.
 Some concepts, such as social revolutions, genocides, and wars as defined by the COW project, are sufficiently rare that a census of these events over finite temporal and spatial domains is feasible. But there are few such events.
 I developed this thought experiment following my first year of graduate school, in the summer of 1987, as I was struggling to make sense of why it was that academia did not have a census of violent political conflict data, despite such projects as . That governments have a strong disincentive to collect such data occurred to me pretty quickly, but that only explained why governments did not collect and disseminate such data. It did not explain why academic projects had the rather obvious sample selection problems that they suffer from. The thought experiment produced the “Aha!” moment that explained the outcome and allowed me to abandon the Quixotic search for the Fountain of Youth.
 The choice of civil disobedience is, of course, arbitrary. The thought experiment works equally well should you substitute any of the following: terror attacks, riots, armed guerrilla attacks, government (sponsored) disappearances, people tortured, extra-judicial killings, people raped, evidence of genocide, and so on ad infinitum.
 Please note that I have yet to mention source bias, one of the two issues that most dominates discussions about the reliability and validity of contentious politics events data.
 For example:
Kristian Lum, Megan Emily Price, and David Banks (2013). “Applications of Multiple Systems Estimation in Human Rights Research.” The American Statistician 67(4): 191–200. DOI: 10.1080/00031305.2013.821093
Jule Krüger, Patrick Ball, Megan Price, and Amelia Hoover Green (2013). “It Doesn’t Add Up: Methodological and Policy Implications of Conflicting Casualty Data.” in Counting Civilian Casualties: An Introduction to Recording and Estimating Nonmilitary Deaths in Conflict, ed. by Taylor B. Seybolt, Jay D. Aronson, and Baruch Fischhoff. Oxford University Press.
Daniel Manrique-Vallier, Megan E. Price, and Anita Gohdes (2013). “Multiple-Systems Estimation Techniques for Estimating Casualties in Armed Conflict.” in Counting Civilian Casualties: An Introduction to Recording and Estimating Nonmilitary Deaths in Conflict, ed. by Taylor B. Seybolt, Jay D. Aronson, and Baruch Fischhoff. Oxford University Press.
Anita Gohdes and Megan Price (2013). “First Things First: Assessing Data Quality Before Model Quality.” Journal of Conflict Resolution 57(6).