Kerfuffle, Human Rights Data Stylee

A post yesterday at openGlobalRights has generated frustration and ire among several of us who work in the data production (and analysis) salt mines. Lawrence Sáez, a political economist at SOAS, University of London, opined thusly under the headline “Human rights datasets are pointless without methodological rigour”:

In general, I am very supportive of the effort to analyze changes in human rights protections from a cross-national perspective. I believe that dataset initiatives, like CIRI, can help us develop a more nuanced understanding about such trends. However, at present, the CIRI dataset suffers from significant methodological problems that may make it useless for any meaningful statistical analysis.

Pointless?  Useless?  Yup, those be fightin’ words.

Sausage

The hyperbolic tone of Sáez’s post is unfortunate. Sure, hyperbole generates clicks, but it also polarizes. Polarization is fine when one is preaching to the choir and mobilizing one’s side against an opposing side. While I have no interest in peering into Sáez’s head and backing out his goals, as one who produces and uses human rights data I find it challenging to read his post as anything other than an attack. And the unfortunate part is that all the reasonable/useful points he makes get lost in his presentation. Worse, that presentation reinforces the inaccurate biases of people who neither understand data collection nor believe that data, and especially the statistical analysis of data, can teach us anything useful about human rights. When one includes phrases like “As a quantitative political scientist…” and concludes that data are pointless and useless, well, you be the judge.

You can’t post that!

Several of my colleagues have posted comments on the piece, and Sáez added a response. I want to call attention to some of the criticism leveled at oGR for publishing it. Amanda Murdie concludes her comment:

It is disappointing to me that a piece like this would be published at OpenGlobalRights.

Chad Clay, who recently contributed a post to the series of which Sáez’s piece is a part, wrote:

Overall, this post feels beneath the standard of openGlobalRights. The claims made here would not withstand peer-review at the vast majority of journals, and yet, it was published on this popular website without any real vetting of the information it contains.

Those remarks made me wince a bit.  Why?

As the commenters point out, Sáez is ignorant. If we set aside his tone, the post is a sophomoric effort to describe problems that definitely do exist in conflict data generally and human rights data specifically. The problem, from the vantage of experts like me and my colleagues who commented, is that scholars like Sáez do not avail themselves of the opportunity to move beyond the sophomoric critiques that all of us who use human rights data develop when we first engage such data.

But should we call on a blog like oGR not to publish such posts? I don’t think so, and I have weighed in on that here. Many of my colleagues will disagree with my view, but it seems to me that we are better off viewing such posts as an indicator of the challenge we face in educating colleagues who do not study human rights data (about which I write below). I get frustrated by posts like Sáez’s, but I view them as teaching opportunities. Calling for censorship strikes me as a bad idea: doing so will drive the ignorant underground and make it easier for us to believe we are doing a great job teaching others how to think profitably about human rights data. Yes, I very much want people who read oGR to understand what we do, but as I argue below, what we do is complex and technical, and those who are new to the topic–even when they are trained quantitative scientists–will inevitably have reactions akin to those Sáez expressed in his post.

So, yes, Professor Sáez is a quantitative political economist, and he has the training and skill needed to use statistics and data to do useful science. However, like all of us when we first encounter a new topic domain, he has a Pollyannaish expectation of what data collection is likely to look like, and is thus horrified by the apparent compromises made by those who have collected data. One hopes Professor Sáez would not walk into a chemistry lab working on measurement at the nanoscale and then write an incredulous blog post about the fantastic assumptions made to assign values to observations, but perhaps he would. He made that error here, and those of us who see it have a responsibility to point it out.

But I cannot support arguments that a blog post is “below standards” and should not have been posted. When it comes to peer-reviewed publication, yes, that is absolutely correct. But blog posts are not refereed, and it is not reasonable, in my view, to imagine that they might be.

Then there is the issue of hyperbolic critique.

In a post on the issue of “mutual respect” in academic blogging I wrote:

what do you think about professional ethical conduct in the blogosphere? Quixotically wishing it away is not a conversation that interests me, but I want to encourage you to think about whether the collective action (CA) problem should be addressed, and if so, how?

My own rule, thus far, has been to permit myself wide latitude when I am posting on my own blog, but to limit myself to a more professional (less conversational) and more collegial (less snarky) tone when writing on a collective blog. Is that a good/useful rule?

Of course, we are the norm entrepreneurs of the academic blogosphere.  None of us know what function academic blogging should perform, nor what norms should govern its conduct.  It seems to me it will emerge from practice.  But it also seems to me useful to use this kerfuffle (and others like it) to raise the question and encourage others to weigh in.  What norms do you advocate?

Newbies and human rights data

Let me now return to the issue of researchers who are familiar with data but have never thought about collecting human rights data. Happily, there are both blogged discussions and published research that investigate the topic and offer guidance to those who are new to it.

Back in 2014 I wrote a pair of snarky posts under the title “Two Rubes Walk into a Bar, Order Event Data” and followed them with another pair of posts I titled “No More Fountains of Youth/Pots of Gold: Conceptualization and Events Data.”[1] I also co-organized (with Christian Davenport) a pair of meetings on creating conflict data, and in 2015 we released a Creating Conflict Data: Standards & Best Practices document for researchers to follow (link to PDF).

The four posts above, and the standards and best practices document, are too esoteric to be of value to people who lack basic training in statistics and scientific theorizing and an interest in conflict data. But they are precisely the sort of thing that someone like Sáez could usefully engage. He would learn, among other things, that researchers who study human rights with data have been publishing peer-reviewed work on the types of concerns he describes since 1986.

That said, I am but one person who has contributed useful blog posts on these issues! Happily, Anita Gohdes tweeted a nice dust devil in response to Sáez and identified several of them (though hardly a comprehensive list).

[Twelve screenshots of Anita Gohdes’s tweet thread responding to Sáez and identifying relevant posts and published work on human rights data.]

I’m Out

If I can muster the energy and block the time, I will draft a specific response to Sáez to try to find some wheat in his pile o’ chaff. Don’t hold your breath, but hopefully I’ll follow through.

@WilHMoo

Correction: In the initial post I botched my capitalization of the acronym for openGlobalRights. I’ve corrected that, replacing “OgR” with “oGR.”

[1] Not all events data are human rights data, nor are all human rights data events data.  Nevertheless, there is considerable overlap, and the issues I discuss in those posts are relevant for human rights data.


12 Responses to Kerfuffle, Human Rights Data Stylee

  1. Will, I think you are missing the larger point. No, of course I am not calling for restrictions on speech. Saez, you, or whoever else is free to write whatever they want. I am calling for openGlobalRights, a blog that does referee posts and takes a very strong hand in editing submissions, to do a better job of soliciting and checking posts for errors. I see the Saez post as openGlobalRights deciding to publish “fake news” in order to get blog hits. It does not represent the current state of human rights data or academic literature; it’s not even close.

    Many of my colleagues and I are working hard with NGOs in the human security community. For example, I just started a project with an NGO on forced marriage, and I presented to hundreds of INGOs in Geneva last week about my research. openGlobalRights’ decision to publish Saez’s post could make these collaborations more difficult. It brings unnecessary and inaccurate doubt about the validity and rigor of the current academic literature to the minds of the practitioner audience it is trying to reach.

    I stand by my call for the post to be retracted. If the post had been submitted to the blog where I am a permanent member, Duck of Minerva – a blog with far less of a practitioner readership – I would have voted that it not be posted because of these errors. I had hoped openGlobalRights, as a blog with a mission to connect practitioners and academics, would do the same.

    • Will H. Moore says:

      Thanks for taking a moment to explain your position a bit more, Amanda! I believe I follow the larger point. We just don’t see eye to eye about it. And I may well be wrong about my beliefs (and I do try to update when I am). Let’s see if I can identify what I think drives our different views.

      First, and you might consider this splitting hairs, but I distinguish between edited and refereed. I consider refereeing a system in which the editors solicit the advice of people who are expert in an area, and then make decisions about whether to publish based on their own reading in conjunction with the reading of the referee(s). In my limited experience oGR is not refereed. It is, as you note, strongly edited. And I may be mistaken about the refereeing; if so, hopefully someone will correct me.

      Should blogs like oGR be refereed? I do not believe that they should. Of course, what happens with that over the next decade is anyone’s guess, but I like blogs as a distinct space.

      Second, you are quite reasonably concerned that ignorant, sloppy, or simply inaccurate posts like Saez’s will make it harder than it already is to work collaboratively with (I)NGOs. That might well be so, and my knowledge here is small-N and biased (largely drawn from my network with HRDAG and a few of the science-oriented staff at AI and HRW). But while I agree that the risk exists, and stipulate that posts like Saez’s are unhelpful at best, I doubt that the risk you are concerned about amounts to much.

      You and I might have rather different beliefs about the friendliness of human rights (I)NGO staff toward statistical analysis and data collection, as well as different beliefs about the likelihood of a shift in that distribution in response to a post like Saez’s (or Amelia Green’s, or my own at oGR). I doubt these posts have much aggregate impact, but let me try to develop my views in a bit of detail.

      To fix ideas, let’s imagine the distribution of scores from a feeling-thermometer survey question asked of a random sample of oGR readers about their sentiment toward the value of data collection and statistical analysis in human rights work. My own belief is that the median value over the 0-100 range would likely be around 20, possibly with a mode below that. But I am a pessimist.

      Given the hyperbolic tone Saez took, I find it difficult to imagine that very many readers would update their beliefs: those with scores under 40 are likely to retain their negative views, while those with scores over 40 are likely to wrinkle their noses at the tone and be dubious about the merits of the post. But I may well be mistaken.

      Now, I can imagine someone arguing that many oGR readers are low-information types (I am borrowing lingo from the partisanship and voting literature), and noting that low-information types are the most likely to update their beliefs in response to new information. This type of argument would support your view: let’s imagine that 60% of the respondents are low-information types whose views are pretty flexible. A post with that headline and conclusion would certainly drive down the thermometer scores of these readers, and some of those people might be involved in their (I)NGO’s efforts, thus inducing the collaboration difficulties you envision.
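      To make the thought experiment concrete, here is a toy simulation of that scenario. Every number in it (the skewed prior, the 60% share of flexible readers, the size of the shock) is an illustrative assumption, not survey data:

      ```python
      import numpy as np

      rng = np.random.default_rng(42)
      n = 1_000  # hypothetical sample of oGR readers

      # Illustrative prior: right-skewed thermometer scores (0-100)
      # with a median near 20, matching the pessimistic guess above.
      prior = np.clip(rng.gamma(shape=2.0, scale=12.0, size=n), 0, 100)

      # Suppose 60% of readers are flexible, low-information types.
      low_info = rng.random(n) < 0.60

      # A hyperbolic negative post shifts only those flexible readers
      # downward; the 15-point shock is an arbitrary illustration.
      posterior = prior.copy()
      posterior[low_info] = np.clip(posterior[low_info] - 15.0, 0, 100)

      print(f"median score before the post: {np.median(prior):5.1f}")
      print(f"median score after the post:  {np.median(posterior):5.1f}")
      ```

      Under those assumptions the median drops sharply, which is exactly the collaboration-chilling effect you envision; the open question is whether the real distribution, and real readers’ updating, look anything like this.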

      I certainly cannot say that does not happen. It may well happen, and if so, it certainly strengthens your case for retraction.

      But if that is the true state of the world, then don’t those of us who are not burdened by Saez’s ignorance have an opportunity to use his post as a foil, grab the teaching moment, and write posts that will move those low-information types back in our direction?

      Consider the optics. What if we successfully demanded a retraction? Can you imagine a Perestroika / anti-DART style series of posts that draw attention to Saez’s retracted post, painting him as a victim? Not only can I imagine it, I believe it would be a quite likely outcome. Sure, all those bloggers would be high-information types who already have low thermometer scores. But what impact would that have on the low-information types?

      Let’s now add networks. If my belief about the prior distribution of the hypothetical thermometer scores is close to accurate, then it is difficult to imagine that the modal network ties of a low-information type to anti-data high-information types would be greater than that same person’s modal ties to pro-data high-information types. If that’s the case, then it is unlikely that a retraction would produce the outcome we seek. It should reduce the chances that oGR publishes posts like Saez’s in the future, but the present impact would be the opposite of what we seek.

      Alternatively, if we embrace a poor post like this one as a foil for teaching, then we have an opportunity to correct the errors and persuade oGR readers that critics like Saez are misinformed.

      So, those are my two cents. I welcome your thoughts, should you have the time and inclination.

      • Hi Will,

        I actually think we agree on much of this. As you said, we just make different assumptions about the effects of something like this on the practitioner community and, probably, about the level of engagement or “teaching opportunity” that Saez – as a practicing academic who identified himself as a quantitative expert – should be afforded for his post.

        I consider oGR to be refereed in the sense that Jim Ron is an academic, and the decision to publish or not publish a post submitted to oGR is thus made by an academic peer. I agree with you that, to the best of my knowledge, Jim does not send submissions out for external peer review. But Jim’s role, at least in my opinion, makes this a bit different from a merely edited clearinghouse of posts. The managing editor (also a published academic) and Jim both made changes to the text of what I had submitted there. These were pretty substantial changes that could be seen as making my piece “punchier.” I wonder whether and to what extent oGR had a say in the title of the Saez piece and in which sentences were highlighted in the post. In my opinion, the clickbait title and highlighted text contribute to the overall negative effect this piece could have on practitioner-academic interactions.

        Anyway, I’ve given Saez’s post way too much thought this week. I look forward to your response to Saez. I will be assigning these posts in my graduate class (and probably my undergraduate class) next year. I think they are a useful look into a very unregulated academic space.

        Best,

        Amanda

  2. I wanted to thank Will Moore for arguing that I should not be silenced. His preference is to call me “ignorant”, my writing “sophomoric”, and my expectations “Pollyannaish”. I guess that it is better to be ridiculed than silenced. So I thank you for your magnanimity.

    If I may, I would like to rebut a couple of points that you have made in your blog. You argue that I am ignorant about the difficulties of coding in human rights datasets. I have developed my own political economy and insurgency datasets and contributed to the development of institutional datasets (most recently for Transparency International). I am fully cognisant of the difficulties of coding data in a dataset. So clearly you are speaking out of ignorance here.

    You also claim that I conclude that data “are pointless and useless”. This is not what I argue in my piece. I argued that the 0-2 coding scale in the CIRI dataset is too blunt and that it creates a number of econometric problems. In my editorial, I offered a practical and constructive solution (recoding the existing variables into fuzzy sets) to correct these problems. My proposed solution would offer a more finely graded scale that would support more nuanced econometric analysis.
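    To illustrate the mechanics of such a recoding, here is a minimal sketch; the membership anchors are hypothetical placeholders, not values proposed in the editorial, and any real calibration would require substantive justification.

    ```python
    import numpy as np

    def ordinal_to_fuzzy(scores, anchors=(0.05, 0.50, 0.95)):
        """Recode a 0-2 ordinal indicator into fuzzy-set membership
        scores in [0, 1]. The anchors (full non-membership, crossover,
        full membership) are illustrative placeholders only."""
        mapping = {0: anchors[0], 1: anchors[1], 2: anchors[2]}
        return np.array([mapping[s] for s in scores])

    # A made-up ordinal series, recoded into membership scores.
    raw = [0, 1, 2, 2, 1, 0]
    print(ordinal_to_fuzzy(raw))  # [0.05 0.5  0.95 0.95 0.5  0.05]
    ```

    The mapping itself is mechanical; whatever nuance a fuzzy-set scale adds must come from how the calibration anchors are chosen and justified.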

    I think that nuance is as important in data analysis as it is in scholarly debate. When this “ignorant” scholar wrote a “sophomoric” editorial on a nuanced econometric point, I anticipated some negative reactions; that is part of civilised scholarly debate. I am disappointed that this has not been the case. In the responses to my editorial on the openDemocracy site, and in your own commentary about me and my work, I see that name-calling and ad hominem attacks are the principal means being used to engage with my arguments. I guess that in an era of Internet trolling and Trump-style political dynamics, this type of sophomoric behaviour is to be expected, even in scholarly circles. So I guess that I am a bit Pollyannaish in the expectation that civilised scholarly debate on the quantification of human rights variables is possible.

  3. David Cingranelli says:

    Like you, I am not angry about Saez’s critique of standards-based human rights data. It is an opportunity for senior scholars who have been doing quantitative work in the field to educate those who are just beginning. However, I do not know where to begin. There is already so much we have written on the subject. Maybe organizing a “human rights and internal conflict” workshop at a future Peace Science meeting would be a good idea. But would junior scholars participate?

  4. James Ron says:

    Dear all, as editor of openGlobalRights, I want to take a moment to explain our editorial policy:

    We are set up as a forum for debate and dialogue that explicitly tries to break down disciplinary boundaries, and purposely invites new perspectives to encourage learning agility among human rights scholars and practitioners. We do edit pieces, but only for clarity.

    Our intention is for each debate to develop organically through multiple perspectives. The learning we seek to encourage is cumulative; it is a function of the exchange of views, rather than of any single post.

    I think the debate people are having now over Lawrence Saez’s piece could be used very productively to explore the opportunities, challenges and limitations of quantitative human rights work.

    Many of our debates are structured similarly and are, I believe, enormously productive. For example, the debate we curated over human rights and religion had many deeply opposing views. I am convinced, however, that anyone who read through that debate would emerge much the wiser, with a far more nuanced understanding.

    As long as commentators remain respectful, these debates can work wonders.

  5. Christopher Fariss says:

    I posted the comment below at http://www.opendemocracy.net. I’m posting it here as well:

    I sent the letter below to James Ron yesterday morning (April 5, 2017). I asked him to post the letter as a new blog entry. His response was that, though they do not post letters to the editor, he would post a blog entry if I wanted to write one. I’m revising some of the key points from the letter into such a document now. The key paragraph from the letter is:

    “The CIRI dataset has its issues, and my concern is not that the article adopts a critical stance toward this data collection project. Rather, the article is highly misleading. The author uses a good deal of statistical jargon, suggesting to readers that the author is an expert on the use of such methods in social science. However, much of the discussion is simply incorrect from a methodological perspective. This is not a matter of judgment, but of fact.”

    The full letter is below:

    Dear Professor Ron,

    I write to respectfully express deep concern about the recent post “Human rights datasets are pointless without methodological rigour” by Lawrence Saez. Because the post contains numerous factual errors and misrepresents the use of standard statistical tools, it is not a useful focal point for continuing a productive dialogue about how to improve and use human rights measures.

    I study both the production of human rights information and the statistical practice of measurement. In several published and forthcoming articles, I offer critiques of existing human rights data. These critiques include the data from the CIRI project. Though I am critical in these articles, I also offer practical tools that can address bias and measurement error in existing data.

    The CIRI dataset has its issues, and my concern is not that the article adopts a critical stance toward this data collection project. Rather, the article is highly misleading. The author uses a good deal of statistical jargon, suggesting to readers that the author is an expert on the use of such methods in social science. However, much of the discussion is simply incorrect from a methodological perspective. This is not a matter of judgment, but of fact.

    For example, in just one sentence from the article, the author states that “[o]ne of the fundamental expectations from working with parametric data is that it must have equal covariance within variables, namely that the individual values of variables change over time.” This sentence contains three factually incorrect statements: (1) Data are not parametric or non-parametric. Samples of data are, however, often modeled using an assumed distribution with estimated parameters. The normal distribution is one such model. (2) Covariance is a probabilistic concept that is statistically estimated to represent the relationship between two random variables, not one within a random variable. Moreover, it is not necessary for the values of a random variable to vary in any particular way. (3) That is, change over time is not a necessary condition for coding realizations of a random variable. A random variable is simply a function from a sample space to the real line, and so variation “within” it is nonsensical. The author of this post has conflated variation among data and the variance of a random variable. Indeed, in probability theory, a random variable can be constant. Overall, the statements by the author are nonsensical and misleading. There are many other examples beyond this single sentence. These must be amended right away.
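    A few lines of numpy, using made-up toy values rather than CIRI data, illustrate points (2) and (3):

    ```python
    import numpy as np

    # Toy arrays standing in for two indicators (made-up values).
    x = np.array([0, 1, 2, 1, 0, 2, 1, 1])
    y = np.array([1, 2, 2, 1, 0, 2, 2, 1])

    # Point (2): covariance is defined between TWO random variables.
    # np.cov returns the 2x2 sample covariance matrix; the
    # off-diagonal entries are Cov(x, y).
    print(np.cov(x, y))

    # Point (3): a variable need not change to be valid data. A
    # constant series is a realization of a degenerate random
    # variable; its sample variance is simply zero.
    z = np.full(8, 2)            # a unit scoring 2 in every period
    print(np.var(z, ddof=1))     # 0.0
    ```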

    Furthermore, I am worried that this article will detract from openDemocracy’s mission to improve human rights around the world by potentially hurting new and future relationships between scholars and practitioners. I think that the article is inaccurate enough to warrant retraction. At the very least, though, I urge you to quickly and definitively post corrections and a statement about the inaccurate statistical claims.

    I do not make these claims lightly and I recognize that they may hurt my relationship with you and your organization. That is not my intention. I just have a deep sense of professional obligation to respond to untruth and lack of fairness in presentation, both of which are on display here.

    I am also prepared and willing to write a point-by-point rebuttal to each of the statistical and data claims that arise in this article. I am available for any additional questions you might have for me.

    Sincerely,
    Christopher J. Fariss
    Assistant Professor, Department of Political Science
    Faculty Associate, Center for Political Studies, Institute for Social Research
    University of Michigan

  6. drdaver68 says:

    I echo David Cingranelli’s sentiments. And I add a dash of regret. When you have a data set bearing your name used by thousands of users in nearly 200 countries over some time period, you face an unimaginable tidal wave of complaints, accusations, etc. Some of these play out in private communiques from states and/or their lawyers, meetings with IOs and NGOs, at conferences and in journals, and in mainstream media outlets. You do get used to criticism of all kinds and intensities. And only you know the full extent of all this.

    My regret is that, being an intensely private person, I turned even further inward to deal with this, and did not go out and engage publicly enough to create the kind of knowledge about the processes and metrology behind CIRI that I should have. So, I’m to some extent to blame for the conditions that give rise to a post like that from Mr. Saez — who wants to engage, but is starting out with a lot of issues that have been long, long debated, as David C. points out. I wonder whether an APSA short course (or a roundtable with data makers, or something like that) on the history (and present) of these issues would be of use to junior scholars and non-specialists.

    Certainly, I was disappointed in aspects of the Saez post. There are the definite technical issues Chad and Chris pointed out. There’s what Will called the “hyperbolic” tone (e.g., calling imperfect data sets “pointless”). And there’s the odd charge that the measures tell one more about the rater than the thing being rated. I’m disappointed because if Mr. Saez wanted to wage a substantial metrological critique of CIRI, he’d have done better going the route of Chris Fariss. If he wanted to wage a critique of CIRI’s information sources, he’d have done better going the route of Clark & Sikkink, or Poe & Vasquez or, more recently, Michael Colaresi.

    There are imperfections in all data (despite what some national statisticians claim about their homicide statistics, lol). It’s reasonable to argue about these imperfections. I agree with Amanda, Chad, and Chris that the post is “below the bar” for a premier site like oGR, for three reasons. First, my answer to “Is our understanding of, or dialogue about, the problems of standards-based data enhanced by this posting?” is “no,” because, as David C. pointed out, this is all well-trod ground. Second, I share their concerns about engagement. Right now, I am helping with the implementation component of a new international treaty, and selling states on developing data infrastructures they can trust to report on themselves is dicey, at best. A headline questioning widely-used data as possibly “pointless,” without sufficient substantive merit to back up that claim, is not helpful. Now, just because that’s a problem for Amanda, Chad, or me doesn’t mean it’s Mr. Saez’s problem. However, given the nature of oGR as a connector between academia and policymakers, this has to be a consideration for a site’s moderator in deciding whether to accept a post as-is or to request greater substantive depth before posting. Finally, CIRI is dead. To me, there’s no point in arguing about dead data, and — to the extent people keep arguing about CIRI — future endeavors are jeopardized.

    Sincerely,
    David (the other one)

    • Will H. Moore says:

      Thanks David, I really appreciate you sharing your experience and thoughts here! It reminds me of Christian Davenport and our confederates at Mindfields. And I am thinking that it might be a good idea to record a virtual interview with you and DC about the history/process of CIRI and post it to YouTube.

      • drdaver68 says:

        Thanks, Will. I’d be happy to. Sharing that story about history/process is something I should have done long ago.

  7. K. Chad Clay says:

    I wrote this last night, but following a power outage and a long, ongoing delay at the airport, I am just now getting around to posting it. As a result, the conversation here has largely moved beyond this by now, but I leave it here as something of an explanation of my thinking on the topic:

    That’s a great post, Will. Thanks for sharing.

    I would like to share a few thoughts about the oGR censorship bit, though. I’m fine if someone is out there posting this stuff on their personal blog, but my understanding (which, admittedly, could be wrong) is that oGR’s goal is to encourage informed discussion between academics and practitioners in a way that facilitates increased interaction and cooperation between the two groups.

    I’m currently in the process of recruiting practitioners to work with me on a measurement project, and my experience is that many practitioners are skeptical of cross-national quantitative data (and of those of us who produce it). Further, a great many practitioners regularly read oGR; I’ve discussed posts there with human rights advocates on several occasions. You know better than most how sticky ideas are and how difficult it is to convince someone that information is incorrect once it is in a person’s head, especially if they were positively disposed to those ideas in the first place. After those who are skeptical of quantitative data to begin with see something like this posted at a well-regarded outlet (blog or not), they will use it to reinforce their existing beliefs. For those of us trying to engage with them, it makes our jobs harder, as it sets a higher bar for convincing them that we are trustworthy and willing to listen to their concerns.

    So, there’s a trade-off of goals going on here. The original, pre-edited post contained information that was verifiably inaccurate, which was part of my reasoning for the “below the standards of oGR” remark. That information has now been removed from the post. But I also think that, if oGR really wants to generate useful dialog between practitioners and academics on this topic, this is a terrible starting point. I think one can easily recognize that this is a polarizing piece that doesn’t necessarily facilitate a productive conversation.

    Of course, if one just wants to generate clicks, then Saez’s post is great. It just cuts against my understanding of what oGR is trying to do. And maybe it is worth it. I am considering writing a post about the flaws in existing human rights data and what many of us are doing about them, and it sounds like you might do something similar. But I can’t help but wonder if we aren’t facing a steeper uphill climb thanks to the existence of Saez’s post on oGR.

  8. James Ron says:

    Dear all,

    I encourage you to use this opportunity to explore the issues that are important to you, using layperson’s language, and assuming no prior technical knowledge. What cross national data collection and measurement problems have you encountered over the years? What methods are you using to address them now? How can practitioners, donors, and others contribute to, and perhaps learn from and use, your efforts? How feasible is their use, given limited budgets, time, and interest?

    If everyone concerned could contribute a well-written, easy-to-understand, 800-word piece over the coming weeks and months, the folks who read our website will learn a ton. The weight of any single post, moreover, will decline.
