An R Utility for UN Treaty Signature / Ratification Data

Those of you who do large-N, statistical analysis of human rights might find handy an R utility that Zach Jones (PhD student, Penn State) has written.  Danny Hill alerted me to it, explaining in an email:

It will scrape the UN website and automatically generate panel data on signature and ratification of any UN human rights treaty, even noting accession and succession. You can download it from Zach’s GitHub, where there are also user instructions.

At that GitHub link Zach provides the following description of what you can do with his script:

The R script utilities.R contains a number of functions that make working with the raw data easier. You can load these functions by simply sourcing the file. It requires stringr, lubridate, and plyr.
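
As a rough sketch of that loading step (not taken from the README): the snippet below checks that the three required packages are available and then sources the script. The path "utilities.R" is an assumption that the file has been downloaded to the working directory; adjust it to wherever you saved the repository.

```r
# Packages the README says utilities.R depends on.
needed <- c("stringr", "lubridate", "plyr")

# TRUE/FALSE for each package, without attaching anything.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

if (!all(installed)) {
  message("Install these first: ", paste(needed[!installed], collapse = ", "))
}

# Assumed location of the script; change this to match your download.
script <- "utilities.R"

# Only source the file when it exists and its dependencies are present.
if (all(installed) && file.exists(script)) {
  source(script)
}
```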

  • loadData loads the data for a specific treaty given its chapter and treaty numbers, which are passed as strings. You can optionally expand the column names (if needed). If you choose to expand the column names you can also transform the data into a country-year format, given a start year and an end year (both passed as strings).
  • searchTreaties searches through the treaty_name column in index.csv using approximate string matching given a maximum distance (internally it uses agrep). If multiple matches are found, the user can select the best match from the console. The trim option is logical and truncates console output to 80 characters (it is true by default). This function calls loadData internally, and allows overloading, so you can pass arguments to loadData by passing arguments to searchTreaties. Note that you have to name the arguments explicitly (you can’t just use argument ordering).
  • createColumns takes a character vector of dates (or a dataframe with one column) with a trailing type identifier (a one or two character code) and a name for said column. It returns an expanded version of that column with column dimension equal to the number of unique type identifiers plus one (for the no identifier category).
  • expandColumns takes a dataframe that may need to be expanded, passes columns that need to be expanded to createColumns, and combines the results.
  • convertPanel takes a character vector of dates (of the form "%d-%b-%Y") and a year for comparison and returns a binary variable indicating whether the year of the date that was passed is less than or equal to the comparison year.
  • expandPanel takes a dataframe, a start year, and an ending year (both strings), and returns a dataframe in country-year format with each data column converted into a binary variable.
  • findDates takes a dataframe and finds columns which follow the %d-%b-%Y date format. Optionally allows for dates with a trailing type identifier.
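
To make the date handling described above concrete, here is a minimal base R sketch of the same idea. It is not Zach's code: the function names (split_date_type, on_or_before, to_panel), the sample data, and the assumption that the type identifier follows the date after an optional space are all hypothetical. It splits a date from its trailing identifier, then builds a country-year panel with a binary indicator that is 1 when the year of the date is less than or equal to the panel year.

```r
# Hypothetical input: one date per country in "%d-%b-%Y" form, optionally
# followed by a one or two letter type identifier (assumed: "a" = accession).
raw <- data.frame(
  country = c("Albania", "Belize", "Chad"),
  ratification = c("04-Oct-1991 a", "10-Jun-1996", NA),
  stringsAsFactors = FALSE
)

# Split each entry into its date part and its (possibly empty) identifier.
split_date_type <- function(x) {
  pattern <- "^([0-9]{1,2}-[A-Za-z]{3}-[0-9]{4}) *([A-Za-z]{0,2})$"
  ok <- !is.na(x) & grepl(pattern, x)
  data.frame(
    date = ifelse(ok, sub(pattern, "\\1", x), NA_character_),
    type = ifelse(ok, sub(pattern, "\\2", x), NA_character_),
    stringsAsFactors = FALSE
  )
}

# 1 if the year of the date is <= the comparison year, else 0.
# The year is read with a regex so the result does not depend on the
# locale-specific month names that "%b" would require.
on_or_before <- function(d, year) {
  date_year <- as.integer(sub("^.*-([0-9]{4})$", "\\1", d))
  as.integer(!is.na(d) & date_year <= year)
}

# Expand one date column into country-year rows with a binary indicator.
to_panel <- function(df, date_col, start_year, end_year) {
  parts <- split_date_type(df[[date_col]])
  panel <- expand.grid(
    country = df$country,
    year = seq(start_year, end_year),
    stringsAsFactors = FALSE
  )
  idx <- match(panel$country, df$country)
  panel[[date_col]] <- on_or_before(parts$date[idx], panel$year)
  panel[order(panel$country, panel$year), ]
}

parts <- split_date_type(raw$ratification)
panel <- to_panel(raw, "ratification", 1990, 1992)
print(panel)
```

In this toy panel Albania is coded 0 for 1990 and 1 from 1991 onward, while Belize (1996) and Chad (no date) stay at 0 throughout 1990 to 1992. The real utilities take years as strings and handle several identifier types at once, so treat this only as an illustration of the logic.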

Happy scraping.


About Will H. Moore

I am a political science professor who also contributes to Political Violence @ a Glance and sometimes to Mobilizing Ideas. Twitter: @WilHMoo