Overview

This notebook accompanies the tutorial of a short course offered at the 2017 Annual Meeting of the American Political Science Association. The instructors for the course are Karsten Donnay (University of Zurich), Eric Dunford (Georgetown University), Andrew Linke (University of Utah), Erin McGrath (University of Maryland), David Backer (University of Maryland), and David Cunningham (University of Maryland). The short course focuses on newly developed software tools, designed by the instructors, that enable more effective work with multiple datasets that have geospatial properties. Such datasets are increasingly employed in research throughout the social sciences. The course aims to familiarize participants with the use of these tools and associated best practices. At the end of the course, participants should understand why and how they could use these tools to support research that requires integrating datasets with particular geospatial properties.

The first part of the notebook walks through the functionality, applications and best practices of the geomerge package, which was just released. This package has been designed primarily to facilitate addressing challenges related to the integration of datasets with different geospatial properties. The package is illustrated using example data for Nigeria 2011. The illustration covers integration of Polygon, Raster and Point data, including how to generate spatial panel data.

The second part of the notebook walks through the functionality, applications and best practices of the meltt package, which was released earlier this year. This package has been designed to facilitate the integration of event data from multiple sources with differing properties. The package is illustrated by drawing on conflict event data from four prominent event datasets covering conflict observed in Nigeria during 2011.

The tutorial is designed to be hands-on, with participants working through the illustrative examples, accessing and processing datasets using the commands available in the geomerge and meltt packages. Doing so requires, at a minimum, an installation of the R programming software. Some knowledge of R is useful, though not mandatory. During this short course and tutorial, participants should learn about the utility, logic, and functionality of the two packages even without any significant expertise in R.

Before we get started, please set your work directory to the directory into which you unpacked the tutorial files (including the “data” directory).

setwd("YOUR DIRECTORY") 

Part 2: meltt - Matching Event Data by Location, Type and Time

Why We Developed meltt

The use case: In various social science settings, empirical research relies on event data, which seek to capture information on individual occurrences of phenomena in a manner that is spatially and temporally disaggregated. Common examples of events for which data are available include incidents of armed conflict, which were discussed earlier (e.g., ACLED), as well as neighborhood crime, terrorist attacks, car accidents, and marathon running times. Event data provide a granular picture of where and when a specific phenomenon occurs.

In an increasing number of instances, more than one available event dataset captures the same or related topics. For example, multiple datasets capture similar information on battles between organized armed actors engaged in civil wars. Different event datasets may have information that is valuable, which is not offered by one dataset alone. Integrating event datasets could be useful to bolster spatial and temporal coverage, to encompass a broader spectrum of a phenomenon (i.e., more types), to collate existing information about events (i.e., compile more characteristics and/or more details), and/or to cross-validate the coding of these datasets (i.e., check whether different datasets yield the same measures of a given phenomenon).

The main challenges: Integrating multiple event datasets requires comparison of the entries in those datasets. Comparison is essential to avoid duplicate entries of the same event. Mere pooling of datasets could yield such duplicates, if different datasets happen to record some or all of the same events. Comparison is also necessary to establish matches. Knowing what entries match across multiple event datasets is useful when collating information about events, or when engaging in cross-validation.

Comparing event data, with the goal of establishing which entries do and do not match across multiple datasets, is notoriously difficult for the following reasons:

  • Spatial and Temporal Fuzziness. Information about events can differ depending on the original sources from which event data are derived. For example, news media sources can vary in their reporting of the location and/or timing of an event—especially if precise on-the-ground information is hard to come by. This variation can result in both spatial and temporal fuzziness, where the same event is “measured” with distinct locations and days across different datasets, due to their reliance on different original sources. The fuzziness can be large or small—and is not always consistent even within the same event dataset, again depending on the original sources.

  • Jittering Locations. Different geo-referencing software can produce slightly different longitude and latitude coordinates for the same named place. Those differences result in an artificial geo-spatial “jitter” around the same location, depending on which gazetteer is used in the software.

  • Conceptual Differences. Different event datasets are designed for different reasons. Each dataset will likely reflect a distinctive coding scheme, even for the same specific category of events, let alone for the same general category of events. For example, a dataset recording local muggings and burglaries might have a schema that records these types of events categorically (e.g., “mugging”, “break in”, etc.), whereas another crime dataset might record violent crimes and do so ordinally (1, 2, 3, etc.). Both datasets might be capturing the same event (e.g., a violent mugging), but each has a distinct way of coding that event.
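To make these three problems concrete, here is a minimal sketch with two made-up records that are meant to describe the same incident. The coordinates, dates and type codes are hypothetical and are not taken from the tutorial data. Simple pooling followed by an exact duplicate check does not flag the pair, even though the records are only one day and a fraction of a degree apart.

```r
# Two hypothetical records of the same incident, as two datasets might code it
a <- data.frame(date = as.Date("2011-06-16"), longitude = 7.4951, latitude = 9.0579,
                type = "Remote violence", stringsAsFactors = FALSE)
b <- data.frame(date = as.Date("2011-06-17"), longitude = 7.4893, latitude = 9.0667,
                type = "3", stringsAsFactors = FALSE)

# Naive pooling: stack the records and look for exact duplicates
pooled <- rbind(a, b)
any_exact_duplicate <- any(duplicated(pooled))

# How far apart are the two records in time and space?
day_gap <- abs(as.numeric(b$date - a$date))   # temporal fuzziness, in days
lon_gap <- abs(b$longitude - a$longitude)     # spatial jitter, in degrees
lat_gap <- abs(b$latitude - a$latitude)

any_exact_duplicate
day_gap
c(lon_gap, lat_gap)
```

The exact check misses the pair because the date, the coordinates and the type coding all differ slightly, which is the situation the three points above describe.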

In the past, researchers seeking to overcome these hurdles have typically relied on hand-coding processes to match data. This approach is extremely time-consuming and costly, especially when done systematically and on a large scale. In the worst case, each entry in one dataset may need to be compared carefully to each entry in every other dataset of interest. This sort of process requires a lot of meticulous work to yield high-quality, reliable results. In practice, the results can be prone to mistakes, because of differences in what coders see as well as inattentiveness, sloppiness, and other forms of human error. The results of hand-coding are also typically hard to reproduce and replicate exactly. For one, doing so requires performing the same comparisons all over again. If done by hand once more, this is time-consuming and costly. Even if automated, the correspondence is unlikely to be perfect. Human coding simply does not ensure 100% consistent output. The performance can be excellent, with clear rules of coding that are strictly observed, though rarely at the level of an automated algorithm.
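A quick back-of-the-envelope calculation shows how fast the worst case grows. The dataset sizes below are invented for illustration and do not correspond to the Nigeria 2011 data.

```r
# Hypothetical numbers of entries in four event datasets
n_entries <- c(A = 500, B = 300, C = 200, D = 100)

# Every pair of datasets that would have to be compared
dataset_pairs <- combn(names(n_entries), 2)

# Entry-by-entry comparisons needed for each pair of datasets
comparisons <- apply(dataset_pairs, 2, function(p) prod(n_entries[p]))
names(comparisons) <- apply(dataset_pairs, 2, paste, collapse = "-")

total_comparisons <- sum(comparisons)

comparisons
total_comparisons
```

Even with these modest sizes, exhaustive hand comparison would involve several hundred thousand pairs of entries.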

What meltt Does

meltt provides a tool for integrating multiple event datasets in an automated, fast, inexpensive, flexible, transparent, and reproducible fashion. More information about the specifics of the method can be found in the package documentation, which will soon be accompanied by an article that describes and applies the methodology.

Using meltt for Research

meltt provides a means of integrating multiple event datasets on the same or similar topics. The output of the package can expand the spatial and temporal scope of coverage, extending analyses in ways that may improve both internal and external validity. The package can be used to integrate data on different types of events, while guarding against duplication in records. The output can therefore be valuable and more reliable in studying relationships among types of events. Further, users can rely on the package to collate information on events as recorded in multiple datasets, to enrich the available details. A final benefit is cross-validation: checking how different datasets measure the same phenomenon.

Installation

The package can also be installed through the CRAN repository.

# install.packages("meltt") 

Again, we recommend that users install the latest development version of the package from Github for the purposes of this tutorial.

devtools::install_github("kdonnay/meltt")

Important: Currently, the package requires that users have both Python (>= 2.7) and a version of the numpy module installed on their computers. A quick way to obtain both is to install the Anaconda distribution. meltt will use these programs in the background.
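If you are unsure whether both are available, a rough check from within R could look like the sketch below. It only looks for an interpreter named python or python3 on the search path and tries to import numpy with it. Which interpreter meltt itself picks up is not something this check can confirm, and the variable names are purely illustrative.

```r
# Look for a Python interpreter on the search path (assumed names: python, python3)
python_bin <- Sys.which(c("python", "python3"))
python_bin <- python_bin[nzchar(python_bin)]
has_python <- length(python_bin) > 0

# Try to import numpy with the first interpreter found
has_numpy <- FALSE
if (has_python) {
  quote_type <- if (.Platform$OS.type == "windows") "cmd" else "sh"
  status <- tryCatch(
    suppressWarnings(
      system2(python_bin[[1]],
              c("-c", shQuote("import numpy", type = quote_type)),
              stdout = FALSE, stderr = FALSE)
    ),
    error = function(e) 1L
  )
  has_numpy <- isTRUE(status == 0)
}

c(python = has_python, numpy = has_numpy)
```

If either value is FALSE, installing Anaconda as suggested above is the simplest remedy.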

library(meltt)

Conflict Data

As an illustration, we use several well-established sources of conflict event data: ACLED, the Uppsala Conflict Data Program’s Georeferenced Event Dataset (UCDP-GED), the Social Conflict Analysis Database (SCAD), and the Global Terrorism Database (GTD). Each of these datasets records information about the spatio-temporal occurrence of conflict activity within a country. You downloaded Nigeria_2011.Rdata together with this tutorial file. To create Nigeria_2011.Rdata, we subset entries from the UCDP-GED, ACLED, SCAD, and GTD datasets for Nigeria 2011.

# Load Data
load("data/Nigeria_2011.Rdata")
library(raster)
## Loading required package: sp
# Quick visual overview of ACLED data
plot(states)
plot(SpatialPoints(cbind(acled$LONGITUDE,acled$LATITUDE)),new=TRUE,add=TRUE)

# Quick visual overview of UCDP-GED data
plot(states)
plot(SpatialPoints(cbind(ged$longitude,ged$latitude)),new=TRUE,add=TRUE)

# Quick visual overview of GTD data
plot(states)
plot(SpatialPoints(cbind(gtd$longitude[!is.na(gtd$longitude)],gtd$latitude[!is.na(gtd$longitude)])),new=TRUE,add=TRUE)

# Quick visual overview of SCAD data
plot(states)
plot(SpatialPoints(cbind(scad$longitude,scad$latitude)),new=TRUE,add=TRUE)

Each dataset contains information on the:

  • date: when the event occurred;
  • enddate: if the event occurred across more than one day (i.e., an “episode”);
  • longitude & latitude: geo-location information;
  • event type: the kind of activity for that entry;
  • actor: who initiated the activity.

We will rely on this information to place entries into “bins” for purposes of appropriately and efficiently comparing entries across datasets, ultimately allowing the identification of potential matching entries (i.e., entries that appear to concern the same event). To reiterate, matching can be useful for several reasons. Perhaps most important is to ensure that integration does not lead to duplicate entries within the integrated data. The user may also be interested to collate information on events as recorded in different datasets, or to cross-validate the measurement of events based on the information available in different datasets.

meltt formalizes all input assumptions the user needs to make in order to compare event datasets and identify entries that may match (i.e., concern the same event). First, the user must specify a spatial and temporal window within which any potential match could plausibly fall. That is, how close in space and time do entries need to be to qualify as potentially recording the same event?
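The idea of a spatial and temporal window can be sketched in a few lines of base R. This is only an illustration of the concept on made-up entries, not meltt's implementation. The window sizes, the toy coordinates, and the helper haversine_km are all assumptions introduced for this example.

```r
# Great-circle distance in kilometres (haversine formula)
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlon <- (lon2 - lon1) * to_rad
  dlat <- (lat2 - lat1) * to_rad
  h <- sin(dlat / 2)^2 + cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(h)))
}

# Hypothetical toy entries from two datasets
d1 <- data.frame(date = as.Date(c("2011-01-10", "2011-03-05")),
                 longitude = c(7.49, 3.39), latitude = c(9.06, 6.45))
d2 <- data.frame(date = as.Date(c("2011-01-11", "2011-03-05", "2011-01-10")),
                 longitude = c(7.50, 8.52, 13.15), latitude = c(9.07, 12.00, 11.85))

# Assumed window: at most 1 day and 5 km apart
time_window_days <- 1
space_window_km <- 5

# All pairs of entries (row i of d1, row j of d2)
idx <- expand.grid(i = seq_len(nrow(d1)), j = seq_len(nrow(d2)))
idx$days_apart <- abs(as.numeric(d2$date[idx$j] - d1$date[idx$i]))
idx$km_apart <- haversine_km(d1$longitude[idx$i], d1$latitude[idx$i],
                             d2$longitude[idx$j], d2$latitude[idx$j])

# Keep only pairs that fall inside both windows
candidates <- idx[idx$days_apart <= time_window_days &
                  idx$km_apart <= space_window_km, ]
candidates
```

Only pairs that are close in both time and space survive as potential matches. A wider window admits more candidates, and a narrower one risks missing true matches that were recorded with some fuzziness.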

Second, to articulate how different coding schemas overlap, the user needs to input an event taxonomy. A taxonomy is a formalization of how variables overlap, moving from as granular as possible to as general as possible. In this case, we are going to explore two taxonomies to help integrate the data: an event taxonomy that generalizes across event types, and an actor taxonomy that generalizes across the various actors located in the data.

Generating a taxonomy

To generate a taxonomy for a variable, that variable must exist in every dataset being integrated. For example, there must be some form of event type variable in each dataset in order to compare events by type. A dataset that lacks the comparable variable simply cannot be compared to the other datasets on that dimension.
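A simple way to verify this before building a taxonomy is to check each dataset for the required column. The sketch below uses made-up data frames and a hypothetical column name, event_type. In the tutorial data the corresponding variable has a different name in each dataset.

```r
# Hypothetical toy datasets; only the column names matter here
datasets <- list(
  A = data.frame(date = as.Date("2011-01-01"), event_type = "Riots"),
  B = data.frame(date = as.Date("2011-01-02"), event_type = "3"),
  C = data.frame(date = as.Date("2011-01-03"))  # no event type recorded
)

# Variable(s) the taxonomy would be built on (assumed name)
required <- "event_type"

# Which datasets carry every required variable?
has_required <- vapply(datasets, function(d) all(required %in% names(d)), logical(1))
comparable <- names(datasets)[has_required]

has_required
comparable
```

Here dataset C would have to be left out of any comparison based on event type.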

For the datasets of interest, we see that each contains information on an event’s type, but that information differs significantly across datasets, given that each was created for different purposes and seeks to capture different types of activities (some of which overlap across datasets, and some of which do not).

For example, observe how the information regarding event type is presented differently across the four datasets.

cat("\n GED \n",
    unique(ged$type_of_violence),
    "\n\n ACLED \n",
    unique(acled$EVENT_TYPE),
    "\n\n GTD \n",
    unique(gtd$attacktype1),
    "\n\n SCAD \n",
    unique(scad$etype)
)
## 
##  GED 
##  3 2 1 
## 
##  ACLED 
##  Strategic development Battle-No change of territory Violence against civilians Remote violence Riots/Protests 
## 
##  GTD 
##  7 2 3 1 6 
## 
##  SCAD 
##  7 4 8 9 3 1 2

The corresponding variable from each dataset records information on the type of event a little differently. The idea of introducing a taxonomy is then, as mentioned before, to generalize across categories by clarifying how each coding scheme maps onto the others.

From the data folder, let’s load an event and actor taxonomy that we already put together for the Nigeria 2011 data.

load("data/taxonomies.Rdata")

A taxonomy allows a researcher to make all assumptions regarding how variables map onto each other explicit. Zooming in on the event taxonomy for the Nigeria 2011 data, we can see that the bins become more general as we move up the taxonomy levels. That is, we attempt to be as granular as possible when locating the overlap on the first level, and then we become more general, ending in just two categories (violent or nonviolent).

# View(event_tax)
event_tax
##    data.source                            base.categories
## 1        acled          Non-violent transfer of territory
## 2        acled           Headquarters or base established
## 3        acled                                   Protests
## 4        acled   Non-violent activity by a conflict actor
## 5        acled                                      Riots
## 6        acled Battle-Non-state actor overtakes territory
## 7        acled        Battle-Government regains territory
## 8        acled              Battle-No change of territory
## 9        acled                            Remote violence
## 10       acled                 Violence against civilians
## 11       acled                      Strategic development
## 12         ged                                          1
## 13         ged                                          2
## 14         ged                                          3
## 15         gtd                                          4
## 16         gtd                                          5
## 17         gtd                                          6
## 18         gtd                                          3
## 19         gtd                                          1
## 20         gtd                                          2
## 21         gtd                                          8
## 22         gtd                                          9
## 23         gtd                                          7
## 24        scad                                          1
## 25        scad                                          2
## 26        scad                                          3
## 27        scad                                          4
## 28        scad                                          5
## 29        scad                                          6
## 30        scad                                          7
## 31        scad                                          8
## 32        scad                                          9
## 33        scad                                         10
## 34        scad                                         -9
##                     Level_1_text                       Level_2_text
## 1            Territorial Dispute              Nonviolent Possession
## 2            Territorial Dispute              Nonviolent Possession
## 3          Protest/Demonstration                Nonviolent Displays
## 4          Protest/Demonstration                Nonviolent Displays
## 5  Violent Protest/Demonstration                   Violent Displays
## 6            Territorial Dispute                 Violent Possession
## 7            Territorial Dispute                 Violent Possession
## 8            Territorial Dispute                     Violent Attack
## 9          Strategic Destruction           Violent Attack (Bombing)
## 10                      Atrocity Violent Attack (Against Civilians)
## 11         Protest/Demonstration                Nonviolent Displays
## 12       Opposition-led Violence                     Violent Attack
## 13       Opposition-led Violence          Violent Attack (No State)
## 14                      Atrocity Violent Attack (Against Civilians)
## 15                      Coercion                 Violent Possession
## 16                      Coercion                 Violent Possession
## 17                      Coercion                 Violent Possession
## 18         Strategic Destruction           Violent Attack (Bombing)
## 19             Strategic Assault                     Violent Attack
## 20             Strategic Assault                     Violent Attack
## 21             Strategic Assault                     Violent Attack
## 22             Strategic Assault                     Violent Attack
## 23         Strategic Destruction                     Violent Attack
## 24         Protest/Demonstration                Nonviolent Displays
## 25         Protest/Demonstration                Nonviolent Displays
## 26         Protest/Demonstration                Nonviolent Displays
## 27         Protest/Demonstration                Nonviolent Displays
## 28 Violent Protest/Demonstration                   Violent Displays
## 29 Violent Protest/Demonstration                   Violent Displays
## 30            State-led Violence                     Violent Attack
## 31       Opposition-led Violence                     Violent Attack
## 32        Within-Regime Violence                     Violent Attack
## 33       Opposition-led Violence          Violent Attack (No State)
## 34            State-led Violence                     Violent Attack
##         Level_3_text     Level_4_text
## 1  Nonviolent Action Nonviolent Event
## 2  Nonviolent Action Nonviolent Event
## 3  Nonviolent Action Nonviolent Event
## 4  Nonviolent Action Nonviolent Event
## 5     Violent Action    Violent Event
## 6     Violent Attack    Violent Event
## 7     Violent Attack    Violent Event
## 8     Violent Attack    Violent Event
## 9     Violent Attack    Violent Event
## 10    Violent Attack    Violent Event
## 11 Nonviolent Action Nonviolent Event
## 12    Violent Attack    Violent Event
## 13    Violent Attack    Violent Event
## 14    Violent Attack    Violent Event
## 15    Violent Action    Violent Event
## 16    Violent Action    Violent Event
## 17    Violent Action    Violent Event
## 18    Violent Attack    Violent Event
## 19    Violent Attack    Violent Event
## 20    Violent Attack    Violent Event
## 21    Violent Attack    Violent Event
## 22    Violent Attack    Violent Event
## 23    Violent Attack    Violent Event
## 24 Nonviolent Action Nonviolent Event
## 25 Nonviolent Action Nonviolent Event
## 26 Nonviolent Action Nonviolent Event
## 27 Nonviolent Action Nonviolent Event
## 28    Violent Action    Violent Event
## 29    Violent Action    Violent Event
## 30    Violent Attack    Violent Event
## 31    Violent Attack    Violent Event
## 32    Violent Attack    Violent Event
## 33    Violent Attack    Violent Event
## 34    Violent Attack    Violent Event
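To see how such a table is read, the sketch below copies four rows from the event taxonomy printed above into a small data frame and looks for the most specific level at which two base categories agree. The helper first_shared_level is a hypothetical illustration, not a meltt function, and it assumes exactly one taxonomy row per source and category.

```r
# Four rows copied from the event taxonomy shown above
toy_tax <- data.frame(
  data.source     = c("acled", "gtd", "ged", "scad"),
  base.categories = c("Remote violence", "3", "1", "1"),
  Level_1_text = c("Strategic Destruction", "Strategic Destruction",
                   "Opposition-led Violence", "Protest/Demonstration"),
  Level_2_text = c("Violent Attack (Bombing)", "Violent Attack (Bombing)",
                   "Violent Attack", "Nonviolent Displays"),
  Level_3_text = c("Violent Attack", "Violent Attack",
                   "Violent Attack", "Nonviolent Action"),
  Level_4_text = c("Violent Event", "Violent Event",
                   "Violent Event", "Nonviolent Event"),
  stringsAsFactors = FALSE
)

level_cols <- grep("^Level_", names(toy_tax), value = TRUE)

# Hypothetical helper: most specific level at which two base categories agree
first_shared_level <- function(tax, source_a, cat_a, source_b, cat_b) {
  row_a <- tax[tax$data.source == source_a & tax$base.categories == cat_a, level_cols]
  row_b <- tax[tax$data.source == source_b & tax$base.categories == cat_b, level_cols]
  agree <- unlist(row_a) == unlist(row_b)
  if (any(agree)) level_cols[which(agree)[1]] else NA_character_
}

first_shared_level(toy_tax, "acled", "Remote violence", "gtd", "3")
first_shared_level(toy_tax, "acled", "Remote violence", "ged", "1")
first_shared_level(toy_tax, "acled", "Remote violence", "scad", "1")
```

ACLED's "Remote violence" and GTD's code 3 already agree at the first level, ACLED and GED code 1 only agree once the categories are generalized to the third level, and the ACLED and SCAD categories never agree because one is violent and the other is not.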

We formalized the actor taxonomy in a similar manner.

actor_tax
##    data.source                                         base.categories
## 1         scad                                                Soldiers
## 2         scad                                              Protesters
## 3         scad                                                  Police
## 4         scad                                                  Gunmen
## 5         scad                                               Islamists
## 6         scad                                                   Women
## 7         scad                                              Christians
## 8         scad                                                  Youths
## 9         scad                                                 Unknown
## 10        scad                                                 Pirates
## 11        scad                                               Civilians
## 12        scad                                                 Muslims
## 13        scad                                                  Fulani
## 14        scad                                          Unknown gunmen
## 15        scad                                              Boko Haram
## 16        scad                                       Unknown attackers
## 17        scad                                         Unknown bombers
## 18        scad                                           Muslim youths
## 19        scad                                 Nigeria Labour Congress
## 20        scad                                               Militants
## 21        scad                                             Muslim sect
## 22        scad                                         Chistian youths
## 23        scad                                         Cattle rustlers
## 24        scad                                            Muslim gangs
## 25        scad                                          Fulani Muslims
## 26        scad                                      Unknown assailants
## 27        scad                          Supporters of Muhammadu Buhari
## 28        scad                                        Muslim attackers
## 29        scad                                        Copy cat killers
## 30        scad                                      Suspected militant
## 31        scad                                     Suspected militants
## 32        scad                                         Gang of robbers
## 33        scad                                            Umar Quality
## 34        scad                     National Union of Electrity Workers
## 35        scad                                 Muslim Fulani tribesmen
## 36        scad                                 Oodua People's Congress
## 37        scad                                          Ezza community
## 38        scad                                              Boko Haram
## 39        scad                                              Boko Haram
## 40        scad                                              Boko Haram
## 41        scad                                              Boko Haram
## 42        scad                                              Boko Haram
## 43        scad                                              Boko Haram
## 44        scad                                              Boko Haram
## 45        scad                                              Boko Haram
## 46       acled                  Military Forces of Nigeria (1999-2015)
## 47       acled                                              Boko Haram
## 48       acled                                              Boko Haram
## 49       acled                      Unidentified Armed Group (Nigeria)
## 50       acled                                       Rioters (Nigeria)
## 51       acled                                    Protesters (Nigeria)
## 52       acled                           PDP: Peoples Democratic Party
## 53       acled Military Forces of Nigeria (1999-2015) Joint Task Force
## 54       acled                    Police Forces of Nigeria (1999-2015)
## 55       acled                                Muslim Militia (Nigeria)
## 56       acled                         Fulani Ethnic Militia (Nigeria)
## 57       acled                             Christian Militia (Nigeria)
## 58       acled                           Ezza Ethnic Militia (Nigeria)
## 59       acled                             Muslim Youth Sect (Nigeria)
## 60       acled                          Christian Youth Sect (Nigeria)
## 61       acled                           DDM: Delta Democratic Militia
## 62       acled                       National Youth Council of Nigeria
## 63       acled                                              Boko Haram
## 64         ged                                       Supporters of ACN
## 65         ged                                    Christians (Nigeria)
## 66         ged                                      Supporters of ANPP
## 67         ged                                   Government of Nigeria
## 68         ged                                                   Hausa
## 69         ged                                               Black Axe
## 70         ged                                   Government of Nigeria
## 71         ged                                                  Deebam
## 72         ged                                                  Fulani
## 73         ged                                            Greenlanders
## 74         ged              Jama'atu Ahlis Sunna Lidda'awati wal-Jihad
## 75         ged                                                   Birom
## 76         ged                                                   Ezilo
## 77         ged                                  Government of Cameroon
## 78         ged                                  Government of Cameroon
## 79         ged                                              Boko Haram
## 80         ged                                         NURTW-Auxiliary
## 81         gtd                                                 Unknown
## 82         gtd                                                 Pirates
## 83         gtd                                                 Muslims
## 84         gtd                                              Protesters
## 85         gtd                                                  Gunmen
## 86         gtd                                                  Youths
## 87         gtd                                               Militants
## 88         gtd                                              Boko Haram
## 89         gtd                                Delta Democratic Militia
## 90         gtd    Ansaru (Jama'atu Ansarul Muslimina Fi Biladis Sudan)
## 91       acled                          Bajju Ethnic Militia (Nigeria)
##                     Level_1          Level_2           Level_3
## 1                government   violent groups    violent groups
## 2           movement groups political groups nonviolent groups
## 3                government   violent groups    violent groups
## 4            violent groups   violent groups    violent groups
## 5          religious groups  civilian groups nonviolent groups
## 6                 civilians  civilian groups nonviolent groups
## 7             ethnic groups  civilian groups nonviolent groups
## 8                 civilians  civilian groups nonviolent groups
## 9                   unknown          unknown    violent groups
## 10           violent groups   violent groups    violent groups
## 11                civilians  civilian groups nonviolent groups
## 12            ethnic groups  civilian groups nonviolent groups
## 13            ethnic groups  civilian groups nonviolent groups
## 14           violent groups   violent groups    violent groups
## 15                     torg   violent groups    violent groups
## 16           violent groups   violent groups    violent groups
## 17           violent groups   violent groups    violent groups
## 18                civilians  civilian groups nonviolent groups
## 19          civilian groups  civilian groups nonviolent groups
## 20           violent groups   violent groups    violent groups
## 21            ethnic groups  civilian groups nonviolent groups
## 22                civilians  civilian groups nonviolent groups
## 23          civilian groups  civilian groups nonviolent groups
## 24           violent groups   violent groups    violent groups
## 25            ethnic groups  civilian groups nonviolent groups
## 26           violent groups   violent groups    violent groups
## 27          movement groups political groups nonviolent groups
## 28           violent groups   violent groups    violent groups
## 29           violent groups   violent groups    violent groups
## 30           violent groups   violent groups    violent groups
## 31           violent groups   violent groups    violent groups
## 32           violent groups   violent groups    violent groups
## 33                  unknown          unknown    violent groups
## 34 nonviolent organizations  civilian groups nonviolent groups
## 35                civilians  civilian groups nonviolent groups
## 36                civilians  civilian groups nonviolent groups
## 37                civilians  civilian groups nonviolent groups
## 38           violent groups   violent groups    violent groups
## 39           violent groups   violent groups    violent groups
## 40           violent groups   violent groups    violent groups
## 41           violent groups   violent groups    violent groups
## 42                     torg   violent groups    violent groups
## 43           violent groups   violent groups    violent groups
## 44           violent groups   violent groups    violent groups
## 45           violent groups   violent groups    violent groups
## 46           violent groups   violent groups    violent groups
## 47           violent groups   violent groups    violent groups
## 48           violent groups   violent groups    violent groups
## 49           violent groups   violent groups    violent groups
## 50           violent groups   violent groups    violent groups
## 51          movement groups political groups nonviolent groups
## 52                civilians  civilian groups nonviolent groups
## 53           violent groups   violent groups    violent groups
## 54               government   violent groups    violent groups
## 55           violent groups   violent groups    violent groups
## 56           violent groups   violent groups    violent groups
## 57           violent groups   violent groups    violent groups
## 58           violent groups   violent groups    violent groups
## 59                civilians  civilian groups nonviolent groups
## 60                civilians  civilian groups nonviolent groups
## 61           violent groups   violent groups    violent groups
## 62               government   violent groups    violent groups
## 63           violent groups   violent groups    violent groups
## 64           violent groups   violent groups    violent groups
## 65           violent groups   violent groups    violent groups
## 66           violent groups   violent groups    violent groups
## 67               government   violent groups    violent groups
## 68           violent groups   violent groups    violent groups
## 69           violent groups   violent groups    violent groups
## 70           violent groups   violent groups    violent groups
## 71           violent groups   violent groups    violent groups
## 72           violent groups   violent groups    violent groups
## 73           violent groups   violent groups    violent groups
## 74           violent groups   violent groups    violent groups
## 75           violent groups   violent groups    violent groups
## 76           violent groups   violent groups    violent groups
## 77               government   violent groups    violent groups
## 78           violent groups   violent groups    violent groups
## 79           violent groups   violent groups    violent groups
## 80                     torg   violent groups    violent groups
## 81           violent groups   violent groups    violent groups
## 82           violent groups   violent groups    violent groups
## 83           violent groups   violent groups    violent groups
## 84           violent groups   violent groups    violent groups
## 85           violent groups   violent groups    violent groups
## 86           violent groups   violent groups    violent groups
## 87           violent groups   violent groups    violent groups
## 88                     torg   violent groups    violent groups
## 89                     torg   violent groups    violent groups
## 90                     torg   violent groups    violent groups
## 91           violent groups   violent groups    violent groups

Generally, specifications of taxonomy levels can be as granular or as broad as one chooses. The more fine-grained the levels one includes to describe the overlap, the more specific the match. At the same time, if categories are too narrow, it is difficult to conceptualize potential matches across datasets. As a rule, there is a tradeoff between specific categories that can better differentiate among possible duplicate entries and unspecific categories that more easily recognize potentially matching information across datasets.

As a general rule, we therefore recommend including, whenever it is conceptually warranted, both specific fine-grained categories and a few increasingly broad ones. meltt then has more information to work with when differentiating between sets of potential matches. When a dataset contains several potential matches for an entry, meltt automatically favors the one that corresponds most precisely. A good taxonomy is the key to matching data, and it is the primary vehicle by which a user’s assumptions about how the data fit together are made transparent.
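To make the layering of levels concrete, here is a minimal sketch of a two-level actor taxonomy and a helper that reports the finest level at which two source codings agree. The data frame `toy_tax`, the ACLED categories "Army" and "Militia", and the function `finest_shared_level()` are hypothetical illustrations. They are not part of meltt and do not reproduce its internal matching score.

```r
# Toy actor taxonomy (hypothetical rows). The column layout follows the
# tutorial's convention: data.source, base.categories, then one column per level.
toy_tax <- data.frame(
  data.source     = c("scad", "scad", "scad", "acled", "acled"),
  base.categories = c("Soldiers", "Police", "Gunmen", "Army", "Militia"),
  Level_1         = c("government", "government", "violent groups",
                      "government", "violent groups"),
  Level_2         = c("violent groups", "violent groups", "violent groups",
                      "violent groups", "violent groups"),
  stringsAsFactors = FALSE
)

# Return the finest (lowest-numbered) level at which two codings agree,
# or NA if they agree at no level or a coding is not in the taxonomy.
finest_shared_level <- function(tax, source_a, cat_a, source_b, cat_b) {
  level_cols <- grep("^Level_", names(tax), value = TRUE)
  row_a <- tax[tax$data.source == source_a & tax$base.categories == cat_a, level_cols]
  row_b <- tax[tax$data.source == source_b & tax$base.categories == cat_b, level_cols]
  agree <- unlist(row_a) == unlist(row_b)
  if (any(agree)) level_cols[which(agree)[1]] else NA_character_
}

# Soldiers and Army share the specific Level_1 category.
finest_shared_level(toy_tax, "scad", "Soldiers", "acled", "Army")
# Soldiers and Militia only meet at the broader Level_2 category.
finest_shared_level(toy_tax, "scad", "Soldiers", "acled", "Militia")
```

With only the broad level, both pairs would look equally plausible. The fine-grained level is what lets the first pair be preferred over the second.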

Preparing the data for integration

There are a few technical details regarding how the data must be organized to submit as arguments in meltt.

  • Taxonomies must be organized as lists: each taxonomy data.frame is read into the taxonomy argument of meltt as part of a single list object.
taxonomies = list("event_tax" = event_tax,
                  "actor_tax" = actor_tax)
str(taxonomies)
## List of 2
##  $ event_tax:'data.frame':   34 obs. of  6 variables:
##   ..$ data.source    : chr [1:34] "acled" "acled" "acled" "acled" ...
##   ..$ base.categories: chr [1:34] "Non-violent transfer of territory" "Headquarters or base established" "Protests" "Non-violent activity by a conflict actor" ...
##   ..$ Level_1_text   : chr [1:34] "Territorial Dispute" "Territorial Dispute" "Protest/Demonstration" "Protest/Demonstration" ...
##   ..$ Level_2_text   : chr [1:34] "Nonviolent Possession" "Nonviolent Possession" "Nonviolent Displays" "Nonviolent Displays" ...
##   ..$ Level_3_text   : chr [1:34] "Nonviolent Action" "Nonviolent Action" "Nonviolent Action" "Nonviolent Action" ...
##   ..$ Level_4_text   : chr [1:34] "Nonviolent Event" "Nonviolent Event" "Nonviolent Event" "Nonviolent Event" ...
##  $ actor_tax:'data.frame':   91 obs. of  5 variables:
##   ..$ data.source    : chr [1:91] "scad" "scad" "scad" "scad" ...
##   ..$ base.categories: chr [1:91] "Soldiers" "Protesters" "Police" "Gunmen" ...
##   ..$ Level_1        : chr [1:91] "government" "movement groups" "government" "violent groups" ...
##   ..$ Level_2        : chr [1:91] "violent groups" "political groups" "violent groups" "violent groups" ...
##   ..$ Level_3        : chr [1:91] "violent groups" "nonviolent groups" "violent groups" "violent groups" ...
  • Taxonomies must be named the same as the variables they seek to describe: meltt relies on simple naming conventions to identify which variable is what when matching.
names(taxonomies)
## [1] "event_tax" "actor_tax"

In this case, let’s rename the variables in the data to correspond with the naming convention of the taxonomies that we designated.

# for events
names(ged)[names(ged)=='type_of_violence'] = 'event_tax'
names(acled)[names(acled)=='EVENT_TYPE'] = 'event_tax'
names(scad)[names(scad)=='etype'] = 'event_tax'
names(gtd)[names(gtd)=='attacktype1'] = 'event_tax'

# for actors
names(ged)[names(ged)=='side_a'] = 'actor_tax' # given UCDP dyad conventions
names(acled)[names(acled)=='ACTOR1'] = 'actor_tax'
names(scad)[names(scad)=='actor1'] = 'actor_tax'
names(gtd)[names(gtd)=='gname'] = 'actor_tax'
  • Each taxonomy must contain a data.source and base.categories column: this last convention helps meltt identify which variable is contained in which data object. The data.source column should reflect the names of the data objects used as input, and the base.categories column should reflect the original coding of the variable on which the taxonomy is built.
head( event_tax[, c( "data.source","base.categories" ) ] )
##   data.source                            base.categories
## 1       acled          Non-violent transfer of territory
## 2       acled           Headquarters or base established
## 3       acled                                   Protests
## 4       acled   Non-violent activity by a conflict actor
## 5       acled                                      Riots
## 6       acled Battle-Non-state actor overtakes territory
  • Each input dataset must contain a date, enddate (if one exists), longitude, and latitude column: the variables must be named accordingly (no deviations in naming conventions are allowed). The dates should be in an R date format (as.Date()), and the geo-reference information must be numeric (as.numeric()).

As you might have already realized from looking at the data, they are not perfectly organized in this way, so we will need to do a little cleaning prior to processing.

# Cleaning UCDP-GED
ged$date_start = as.Date(ged$date_start)
names(ged)[names(ged)=='date_start'] = 'date'
ged$date_end = as.Date(ged$date_end)
names(ged)[names(ged)=='date_end'] = 'enddate'

# Cleaning ACLED
colnames(acled)  = tolower(colnames(acled))
acled$event_date = as.Date(acled$event_date)
names(acled)[names(acled)=='event_date'] = 'date'

# Cleaning GTD
gtd$date = as.Date(paste0(gtd$iyear,"-",gtd$imonth,"-",gtd$iday))

# Time and location information must be complete. meltt cannot process entries
# missing this information. Here GTD is missing lat/lon information for one
# entry; thus, we drop it.
gtd = gtd[!is.na(gtd$latitude),]

# Cleaning SCAD
scad$startdate = as.Date(scad$startdate)
names(scad)[names(scad)=='startdate'] = 'date'
scad$enddate = as.Date(scad$enddate)
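
After cleaning, it can be useful to confirm that each dataset actually meets the requirements listed above. The following is a minimal base R sketch. `check_input()` is a hypothetical helper written for this illustration, not a meltt function, and the data frames `good` and `bad` are toy data.

```r
# Hypothetical helper (not part of meltt): list which input requirements
# a data frame violates. An empty result means nothing was flagged.
check_input <- function(df, taxonomy_names) {
  problems <- character(0)
  required <- c("date", "longitude", "latitude", taxonomy_names)
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    problems <- c(problems,
                  paste("missing column(s):", paste(missing_cols, collapse = ", ")))
  }
  # Dates must be of class Date and complete.
  for (v in intersect(c("date", "enddate"), names(df))) {
    if (!inherits(df[[v]], "Date")) problems <- c(problems, paste(v, "is not of class Date"))
  }
  if ("date" %in% names(df) && anyNA(df$date)) {
    problems <- c(problems, "date contains NA")
  }
  # Coordinates must be numeric and complete.
  for (v in intersect(c("longitude", "latitude"), names(df))) {
    if (!is.numeric(df[[v]])) problems <- c(problems, paste(v, "is not numeric"))
    if (anyNA(df[[v]]))       problems <- c(problems, paste(v, "contains NA"))
  }
  problems
}

# Toy data: one clean frame, and one with a character date and a missing latitude.
good <- data.frame(date = as.Date(c("2011-01-01", "2011-01-03")),
                   longitude = c(8.89, 13.16), latitude = c(9.93, 11.85),
                   event_tax = c("Protests", "Riots"),
                   actor_tax = c("A", "B"), stringsAsFactors = FALSE)
bad <- good
bad$date <- as.character(bad$date)
bad$latitude[2] <- NA

check_input(good, c("event_tax", "actor_tax"))
check_input(bad,  c("event_tax", "actor_tax"))
```

In the tutorial's setting, the same helper could be applied to acled, ged, scad and gtd in turn, passing names(taxonomies) as the second argument.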

Lastly, the ACLED data code protests and riots as a single category. We opt to disaggregate this category further by breaking the event type up into “Protests” and “Riots”, using the number of reported fatalities as the delimiter.

acled$event_tax[acled$event_tax == "Riots/Protests" & 
                   (acled$fatalities==0 | 
                   is.na(acled$fatalities))] = "Protests"
acled$event_tax[acled$event_tax == "Riots/Protests" & acled$fatalities>0] = "Riots"

Matching Data

Once the taxonomy is formalized, matching several datasets is straightforward. The meltt() function takes four main arguments:

  • ...: input data;
  • taxonomies =: list object containing the user-input taxonomies;
  • spatwindow =: the spatial window (in kilometers);
  • twindow =: the temporal window (in days).

Below we assume that any two events among the four different datasets occurring within 3 kilometers and 1 day of each other could plausibly be the same event. This “fuzziness” sets the boundaries on how precisely we believe the spatial location and timing of events are coded. It is usually best practice to vary these specifications systematically to ensure that no one specific combination drives the outcomes of the integration task.
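
The idea of a spatio-temporal window can be sketched in a few lines of base R. The functions `haversine_km()` and `within_window()` and the two toy events are hypothetical and written only for this illustration. meltt performs its own proximity checks internally, and its distance computation and boundary handling may differ from the haversine approximation used here.

```r
# Great-circle distance in kilometers (haversine formula, mean Earth radius).
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper: do two events fall within both windows?
within_window <- function(e1, e2, spatwindow = 3, twindow = 1) {
  dist_km    <- haversine_km(e1$longitude, e1$latitude, e2$longitude, e2$latitude)
  days_apart <- abs(as.numeric(difftime(e2$date, e1$date, units = "days")))
  dist_km <= spatwindow && days_apart <= twindow
}

# Two toy events roughly 1.8 km and one day apart.
a <- list(date = as.Date("2011-01-03"), longitude = 13.1603, latitude = 11.8464)
b <- list(date = as.Date("2011-01-04"), longitude = 13.1500, latitude = 11.8333)

within_window(a, b)                  # candidate pair under 3 km / 1 day
within_window(a, b, spatwindow = 1)  # no longer a candidate under 1 km

# Varying the windows systematically: which settings treat a and b as candidates?
windows <- expand.grid(spatwindow = c(1, 3, 5), twindow = c(0, 1, 2))
windows$match <- mapply(function(s, t) within_window(a, b, s, t),
                        windows$spatwindow, windows$twindow)
windows
```

A grid like `windows` is one simple way to organize the sensitivity checks recommended above: rerun the integration for each row and compare how the results change.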

We then assume that event categories map onto each other according to the way that we formalized in the taxonomies outlined above. We fold all this information together using the meltt() function and then store the results in an object named output.

output <-  meltt(acled,ged,scad,gtd,
                 taxonomies = taxonomies,
                 twindow = 1,spatwindow = 3)
##  meltt: Matching Event Data by Location, Time and Type.
##  Karsten Donnay and Eric Dunford, 2018
## 
##  NOTE: Depending on the size and number of datasets integration may take some time!
## 
## 
##  meltt(acled, ged, scad, gtd, taxonomies = taxonomies, twindow = 1, 
##     spatwindow = 3)
## 
##  Checking meltt() arguments and inputs: Done.
## 
## Please note the following:
## 
##   One or more of the input datasets contains episodal data but no 'enddate' varible was specified for dataset 'acled'. If an end date variable exists, please relabel as 'enddate'.
## 
##   One or more of the input datasets contains episodal data but no 'enddate' varible was specified for dataset 'gtd'. If an end date variable exists, please relabel as 'enddate'.
##  Preparing data for integration: Done.
##  Integrating dataset 1 with dataset 2: Done.
##  Integrating merged data and dataset 3: Done.
##  Integrating merged data and dataset 4: Done.
##  Integration completed!

The above message notes that two of the datasets (ACLED and GTD) do not have an enddate variable. That is, they do not contain episodal data. meltt will create a placeholder for the enddate that mirrors the date.

meltt also contains a range of adjustments to offer the user additional controls regarding how the events are matched. These auxiliary arguments are:

  • smartmatch: when TRUE (default), all available taxonomy levels are used and meltt uses a matching score that ensures that fine-grained agreement is favored over broader agreement, if more than one taxonomy level exists. When FALSE, only specific taxonomy levels are considered.
  • certainty: specification of the exact taxonomy level to match on when smartmatch = FALSE.
  • partial: specifies whether matches along only some of the taxonomy dimensions are permitted.
  • averaging: implements averaging of all values that events are matched on when matching across multiple data.frames. That is, as events are matched dataset by dataset, the metadata is averaged. (Note that this can generate distortion in the output.)
  • weight: specified weights for each taxonomy level to increase or decrease the importance of each taxonomy’s contribution to the matching score.

At times, one might want to know which taxonomy level is doing the heavy lifting. By turning off smartmatch, and specifying certain taxonomy levels by which to compare events, or by weighting taxonomy levels differently, one is able to better assess which assumptions are driving the final integration results. This can help with fine-tuning the input assumptions for meltt to gain the most valid match possible.

Output

When printed, the meltt object offers a brief summary of the output.

output
## MELTT Complete: 4 datasets successfully integrated.
## =========================================================
## Total No. of Input Observations:                  915
## No. of Unique Obs (after deduplication):          691
## No. of Unique Matches:                            150
## No. of Duplicates Removed:                        224
## =========================================================

In matching the four conflict datasets, there are 915 total entries. Of those, 150 are events contained within two or more datasets based on their timestamp, location and event characteristics (as expressed by the taxonomies). As such, meltt removes 224 entries that are found to be duplicates, leaving us with 691 “unique” entries.

Likewise, the summary() function offers a more detailed summary of the output.

summary(output)
## 
## MELTT output
## ============================================================
## No. of Input Datasets: 4
## Data Object Names: acled, ged, scad, gtd
## Spatial Window: 3km
## Temporal Window: 1 Day(s)
## 
## No. of Taxonomies: 2
## Taxonomy Names: event_tax, actor_tax
## Taxonomy Depths: 4, 3
## 
## Total No. of Input Observations:                  915
## No. of Unique Matches:                            150
##   - No. of Event-to-Event Matches:                150
##   - No. of Episode-to-Episode Matches:            0
## No. of Duplicates Removed:                        224
## No. of Unique Obs (after deduplication):          691
## ------------------------------------------------------------
## Summary of Overlap
##  acled ged scad gtd Freq
##      X               224
##          X           141
##               X       90
##                   X   86
##      X   X            43
##      X        X        7
##      X            X   24
##          X    X        4
##          X        X    6
##               X   X    3
##      X   X    X        8
##      X   X        X   34
##      X        X   X    6
##          X    X   X    4
##      X   X    X   X   11
## ============================================================
## *Note: 40 episode(s) flagged as potentially matching to an event.
## Review flagged match with meltt.inspect()

Given that meltt objects can be saved and referenced later, the summary function offers a recap of the input parameters and assumptions that underpin the match (i.e. the datasets, the spatiotemporal window, the taxonomies, etc.). Again, the total number of observations, the number of unique and duplicate entries, and the number of matches found are reported, but this time the summary also states how many of those matches were event-to-event (i.e. events that played out within one time unit, where the date equals the end date) and episode-to-episode (i.e. events that played out over several days).

A summary of overlap is also provided, articulating how the different input datasets overlap and where. For example, only 11 entries appear in all four datasets, while 4 entries are found to match across GED/SCAD/GTD, 6 across ACLED/SCAD/GTD, and 34 across ACLED/GED/GTD.
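
The headline counts can be recovered from the overlap table itself. The sketch below transcribes the table from the summary(output) printout into a data frame (1 marks an X) and derives the totals. It assumes that each row counts unique events and that a match spanning k datasets removes k - 1 duplicate entries. This is our reading of the table, and it is consistent with the printed totals.

```r
# Overlap table transcribed from the summary(output) printout (1 = X).
overlap <- data.frame(
  acled = c(1,0,0,0, 1,1,1,0,0,0, 1,1,1,0, 1),
  ged   = c(0,1,0,0, 1,0,0,1,1,0, 1,1,0,1, 1),
  scad  = c(0,0,1,0, 0,1,0,1,0,1, 1,0,1,1, 1),
  gtd   = c(0,0,0,1, 0,0,1,0,1,1, 0,1,1,1, 1),
  Freq  = c(224,141,90,86, 43,7,24,4,6,3, 8,34,6,4, 11)
)

# Number of datasets contributing to each overlap pattern.
n_sources <- rowSums(overlap[, c("acled", "ged", "scad", "gtd")])

unique_obs         <- sum(overlap$Freq)                    # 691
unique_matches     <- sum(overlap$Freq[n_sources > 1])     # 150
duplicates_removed <- sum((n_sources - 1) * overlap$Freq)  # 224
input_total        <- unique_obs + duplicates_removed      # 915

c(unique_obs = unique_obs, unique_matches = unique_matches,
  duplicates_removed = duplicates_removed, input_total = input_total)
```

The same data frame also answers questions such as how many unique events involve a given source, for example `sum(overlap$Freq[overlap$acled == 1])` for ACLED.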

Note: events that have been flagged as matching to episodes require manual review using the meltt.inspect() function. The summary output tells us that 40 episodes are flagged as potentially matching to some event. Technically speaking, episodes (events with different start and end dates) and events are at different units of analysis; thus, user discretion is required to help sort out these types of matches. The meltt.inspect() function eases this process of manual assessment. See below for more information.

Visualization

For quick visualizations of the matched output, meltt contains three plotting functions.

plot() offers a bar plot that graphically articulates the unique and overlapping entries. In this representation, all matching (or duplicate) entries are expressed in reference to the datasets that came before them: any match found in GED is with respect to ACLED, any in SCAD with respect to ACLED and GED, and so forth. As such, the entries from the leading dataset (i.e. the dataset first entered into meltt) are always all black.

plot(output)
## Warning: attributes are not identical across measure variables;
## they will be dropped

tplot() offers a time series plot of the meltt output. The plot works as a reflection, where raw counts of the unique entries are plotted right-side up and the raw counts of the removed duplicates are plotted below it. This offers a quick snapshot of when duplicates are found. Temporal clustering of duplicates may indicate an issue with the data and/or the input assumptions, or it’s potentially evidence of a unique artifact of the data itself.

Users can specify the temporal unit by which the data should be binned (days, weeks, months, or years). Given that the data span 2011, we’ll look at the output by month.

tplot(output, time_unit = "months")
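
The counting behind this mirrored display can be sketched in base R. The data frame `toy` and its `duplicate` flag are hypothetical. tplot() draws the figure itself from the meltt object, so this only illustrates the binning logic as described above, not the function's implementation.

```r
# Toy event dates with a flag for entries removed as duplicates (hypothetical data).
toy <- data.frame(
  date = as.Date(c("2011-01-03", "2011-01-17", "2011-02-02",
                   "2011-02-20", "2011-02-21", "2011-03-05")),
  duplicate = c(FALSE, TRUE, FALSE, FALSE, TRUE, FALSE)
)

# Bin by month; fixing the factor levels keeps both columns even if one is empty.
toy$month <- format(toy$date, "%Y-%m")
counts <- table(toy$month, factor(toy$duplicate, levels = c(FALSE, TRUE)))

# Unique entries are counted upward, removed duplicates downward.
mirror <- data.frame(
  month      = rownames(counts),
  unique     = as.vector(counts[, "FALSE"]),
  duplicates = -as.vector(counts[, "TRUE"])
)
mirror
```

Months in which the `duplicates` column is unusually large relative to `unique` are the ones worth inspecting more closely.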

Similarly, mplot() presents a summary of the spatial distribution of the data by mapping the spatial points. Events where matches were detected are labeled by blue circles. Again, the goal is to get a sense of the spatial distribution of the matches to both identify any clustering/disproportionate coverage in where matches are located, and to also get a sense of the spread of the integrated output.

mplot(output)

The mplot() command, in fact, opens an interactive data browser in the viewer window, allowing a granular inspection of the spatial matches. Information regarding the input criteria by which each entry was assessed (e.g. the taxonomy inputs) is retained and can be referenced by hovering over the point with the mouse.

Extracting Data

Grabbing the De-Duplicated Data

meltt provides two methods for extracting data from the output object.

meltt_data() returns the de-duplicated data along with any columns the user might need. This is the primary function for extracting matched data and moving on with subsequent analysis. The columns = argument takes any vector of variable names and returns those variables in the output. If no variables are specified, meltt returns the spatio-temporal and taxonomy variables that were employed during the match. In addition, the function returns a unique event and data ID for reference.

uevents <- meltt_data(output)
head(uevents)
##   dataset event       date latitude longitude                  event_tax
## 1   acled     1 2011-01-01  9.92850  8.892100      Strategic development
## 2     gtd     1 2011-01-01 11.83333 13.150000                          7
## 3   acled    24 2011-01-03 11.84640 13.160300 Violence against civilians
## 4   acled    25 2011-01-03 11.84640 13.160300 Violence against civilians
## 5     gtd     3 2011-01-03  5.50000  5.983333                          3
## 6   acled    30 2011-01-04  9.28330 12.466700                   Protests
##                            actor_tax
## 1                         Boko Haram
## 2                         Boko Haram
## 3 Unidentified Armed Group (Nigeria)
## 4                         Boko Haram
## 5           Delta Democratic Militia
## 6                  Rioters (Nigeria)

The number of entries in this data frame corresponds with the number of de-duplicated entries in the data.

dim(uevents)
## [1] 691   7

In addition, we can extract specific columns of data by using the columns= argument. Below we extract all the event summary columns for the four datasets to retrieve the qualitative descriptions of the reported events.

uevents2 <- meltt_data(output,
                       columns = c("date","event_tax","longitude","latitude",
                                   "notes","summary","source_headline","issuenote"))
head(uevents2)
##   dataset event       date latitude longitude                  event_tax
## 1   acled     1 2011-01-01  9.92850  8.892100      Strategic development
## 2     gtd     1 2011-01-01 11.83333 13.150000                          7
## 3   acled    24 2011-01-03 11.84640 13.160300 Violence against civilians
## 4   acled    25 2011-01-03 11.84640 13.160300 Violence against civilians
## 5     gtd     3 2011-01-03  5.50000  5.983333                          3
## 6   acled    30 2011-01-04  9.28330 12.466700                   Protests
##                                                                                                                                                                                                                                                  notes
## 1 Suspected Boko Haram arsonists burnt a church in a northern Nigerian city. Arsonists Saturday night who set a fire on the church that gutted a section of it before the fire was put out by residents. No one was hurt in the attack as there were n
## 2                                                                                                                                                                                                                                                 <NA>
## 3                              Gunmen killed three people at a movie theatre in a northern city in an attack police believe is politically-motivated ahead of general elections. The assailants were believed to be thugs loyal to a local politician.
## 4 Suspected members of a radical Islamist sect blamed for a spate of recent attacks in northern Nigeria shot dead an off-duty policeman in Maiduguri. The victim was wearing civilian clothes and was about to enter his home when the attack took pla
## 5                                                                                                                                                                                                                                                 <NA>
## 6                                                                               A riot broke out at Jimeta Prison complex when suspected Boko Haram inmates attempted a prison break by overpowering guards. The attempted break-out was unsuccessful.
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              summary
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               <NA>
## 2                                                                                                                                                                        01/01/2011: On Saturday night, in Maiduguri, Borno, Nigeria, unidentified gunmen set fire to Victory Christ Church by unknown means. No casualties were reported and one section of the church was damaged. No group claimed responsibility, but the militant group Boko Haram was thought to be responsible for the attack
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               <NA>
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               <NA>
## 5 01/03/2011: On Monday night around 0100, in Ughelli, Delta, Nigeria, three people were injured when unidentified militants detonated an improvised explosive device targeting the Independent National Electoral Commission (INEC) office building on Post Officer Road. The building was burned and completely destroyed in the attack. The attack was motivated by the INEC's rigging of the recent election. The militant group Delta Democratic Militia claimed responsibility for the attack.
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               <NA>
##   source_headline issuenote
## 1            <NA>      <NA>
## 2            <NA>      <NA>
## 3            <NA>      <NA>
## 4            <NA>      <NA>
## 5            <NA>      <NA>
## 6            <NA>      <NA>

Note that there is some overlap in the descriptions across datasets. These are events that have been matched up. The information from the original dataset can be retrieved, even if the entry itself has been flagged as a duplicate and removed. In this way, meltt operates as a sophisticated merge function.

Grabbing the Duplicates

meltt_duplicates(), on the other hand, returns a data frame of all events that matched up. This provides a quick way of examining and assessing the events that matched. Since the quality of any match is only as good as the assumptions we input, it is key that the user qualitatively evaluate the meltt output to assess whether any assumptions should be adjusted. As with meltt_data(), the columns = argument can be customized to return variables of interest.

Note that the data are presented differently than in meltt_data(); here each dataset (and its corresponding variables) is presented in its own set of columns. This representation is chosen for ease of comparison. The requested columns are intended to assist with validation.
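
Because every dataset has its own ID columns, the wide layout is easy to filter by which sources took part in a match. The sketch below uses a small data frame `ids` whose values are copied from the event ID columns of the first six rows printed in this section. It assumes that an ID of 0 marks a dataset that did not contribute to the match, which is an inference from the printed output rather than documented behavior.

```r
# Event ID columns from the first six rows of the duplicates output.
ids <- data.frame(
  acled_event = c(0, 0, 0, 0, 0, 0),
  ged_event   = c(66, 194, 160, 0, 70, 67),
  scad_event  = c(105, 0, 21, 123, 0, 110),
  gtd_event   = c(0, 114, 32, 163, 156, 143)
)

# Number of datasets involved in each match (assumes 0 = not involved).
event_cols <- c("acled_event", "ged_event", "scad_event", "gtd_event")
ids$n_sources <- rowSums(ids[, event_cols] != 0)

# Matches that involve both GED and GTD.
ged_gtd <- ids[ids$ged_event != 0 & ids$gtd_event != 0, ]
ged_gtd
```

Filtering in this way narrows the manual review to the pairings of sources one is most interested in validating.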

Below we do not request specific columns. As such, all columns (with unique IDs on each variable) are returned, which yields a wide dataset.

dups <- meltt_duplicates(output)
head(dups)
##   acled_dataset acled_event ged_dataset ged_event scad_dataset scad_event
## 1             0           0           2        66            3        105
## 2             0           0           2       194            0          0
## 3             0           0           2       160            3         21
## 4             0           0           0         0            3        123
## 5             0           0           2        70            0          0
## 6             0           0           2        67            3        110
##   gtd_dataset gtd_event  gtd_eventid gtd_iyear gtd_imonth gtd_iday
## 1           0         0           NA        NA         NA       NA
## 2           4       114 201110030005      2011         10        3
## 3           4        32 201103020016      2011          3        2
## 4           4       163 201112150028      2011         12       15
## 5           4       156 201111270022      2011         11       27
## 6           4       143 201111090012      2011         11        9
##   gtd_approxdate gtd_extended gtd_resolution gtd_country gtd_country_txt
## 1           <NA>           NA           <NA>          NA            <NA>
## 2           <NA>            0           <NA>         147         Nigeria
## 3           <NA>            0           <NA>         147         Nigeria
## 4           <NA>            0           <NA>         147         Nigeria
## 5           <NA>            0           <NA>         147         Nigeria
## 6           <NA>            0           <NA>         147         Nigeria
##   gtd_region     gtd_region_txt gtd_provstate  gtd_city gtd_latitude
## 1         NA               <NA>          <NA>      <NA>           NA
## 2         11 Sub-Saharan Africa         Borno Maiduguri     11.83333
## 3         11 Sub-Saharan Africa         Niger    Suleja      9.18053
## 4         11 Sub-Saharan Africa         Borno Maiduguri     11.83321
## 5         11 Sub-Saharan Africa         Borno Maiduguri     11.83612
## 6         11 Sub-Saharan Africa         Borno    Mainok     11.82986
##   gtd_longitude gtd_specificity gtd_vicinity
## 1            NA              NA           NA
## 2      13.15000               1            0
## 3       7.17934               1            0
## 4      13.15010               1            0
## 5      13.17764               1            0
## 6      12.63022               1            0
##                                                                    gtd_location
## 1                                                                          <NA>
## 2                                   At a tea shop in Maiduguri, Borno, Nigeria.
## 3                              At a secondary school in Suleja, Niger, Nigeria.
## 4                                                                          <NA>
## 5                                                                   Gwange ward
## 6 On the outskirts of Maiduguri, approximately 75 kilometers from Damaturu city
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   gtd_summary
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <NA>
## 2                                                                                                                                                                                                                                                                       10/03/2011: On Monday morning, in Maiduguri, Borno, Nigeria, two militants fired upon and killed a tea seller and a civilian outside a tea shop. No group has claimed responsibility, but the militant group Boko Haram was thought to be responsible for the attack.
## 3                                                                             03/02/2011: On Wednesday afternoon around 1330, in Suleja, Niger, Nigeria, 10 people were killed and 34 others were injured when one man threw an improvised explosive device at a Peoples Democratic Party campaign rally for Niger Governor Babangida Aliyu, at a Nigerian government secondary school. Babangida Aliyu was not injured, but one bus sustained an unknown amount of damage in the attack. No group has claimed responsibility for the attack.
## 4                                                                                                                                                                                 12/15/2011: Suspected members of Boko Haram opened fire on a group of civilians standing outside of a shop in Maiduguri city, Borno state, Nigeria.  Five civilians were killed in the shooting; however, there were no reported injuries.  The assailants were traveling in a vehicle at the time of the attack and fled the scene following the shooting.
## 5 11/27/2011: Three unidentified gunmen shot and killed a government employee in Gwange ward of Maiduguri city, Borno state, Nigeria.  The victim, Kala Boro, was a protocol officer for the Borno state Government House.  The assailants followed him home from work and shot him while he was in his car.  This was one of two multiple incidents; the assailants killed an herbalist in a separate incident after killing Boro.  No group claimed responsibility for the incident; however, sources suspect the involvement of Boko Harm.
## 6                                                                                                            11/9/2011: Approximately 20 suspected members of Boko Haram attacked a police station in Mainok village, Borno state, Nigeria.  The assailants threw explosives inside and burned the police station down; there were no reported injuries as the police station had been closed some time before.  The attack on the police station happened in conjunction with an attack on a federal road safety office in the same village.
##   gtd_crit1 gtd_crit2 gtd_crit3 gtd_doubtterr gtd_alternative
## 1        NA        NA        NA            NA              NA
## 2         1         1         1             0              NA
## 3         1         1         1             0              NA
## 4         1         1         1             0              NA
## 5         1         1         1             0              NA
## 6         1         1         1             0              NA
##   gtd_alternative_txt gtd_multiple gtd_success gtd_suicide gtd_event_tax
## 1                <NA>           NA          NA          NA            NA
## 2                <NA>            0           1           0             2
## 3                <NA>            0           1           0             3
## 4                <NA>            0           1           0             2
## 5                <NA>            1           1           0             2
## 6                <NA>            1           1           0             3
##   gtd_attacktype1_txt gtd_attacktype2 gtd_attacktype2_txt gtd_attacktype3
## 1                <NA>              NA                <NA>            <NA>
## 2       Armed Assault              NA                <NA>            <NA>
## 3   Bombing/Explosion              NA                <NA>            <NA>
## 4       Armed Assault              NA                <NA>            <NA>
## 5       Armed Assault              NA                <NA>            <NA>
## 6   Bombing/Explosion              NA                <NA>            <NA>
##   gtd_attacktype3_txt gtd_targtype1           gtd_targtype1_txt
## 1                <NA>            NA                        <NA>
## 2                <NA>            14 Private Citizens & Property
## 3                <NA>            14 Private Citizens & Property
## 4                <NA>            14 Private Citizens & Property
## 5                <NA>             2        Government (General)
## 6                <NA>             3                      Police
##   gtd_targsubtype1                              gtd_targsubtype1_txt
## 1               NA                                              <NA>
## 2               77           Laborer (General)/Occupation Identified
## 3               67                      Unnamed Civilian/Unspecified
## 4               74                          Marketplace/Plaza/Square
## 5               18 Government Personnel (excluding police, military)
## 6               22   Police Building (headquarters, station, school)
##                      gtd_corp1
## 1                         <NA>
## 2                         <NA>
## 3                         <NA>
## 4                    Civilians
## 5 Borno state Government House
## 6                       Police
##                                                          gtd_target1
## 1                                                               <NA>
## 2                           A tea seller was targeted in the attack.
## 3                                                          Civilians
## 4                   Civilians grouped outside of a shop in Maiduguri
## 5 Kala Boro, a protocol officer for the Borno state Government House
## 6                      An abandoned police outpost in Mainok village
##   gtd_natlty1 gtd_natlty1_txt gtd_targtype2    gtd_targtype2_txt
## 1          NA            <NA>            NA                 <NA>
## 2         147         Nigeria            NA                 <NA>
## 3         147         Nigeria             2 Government (General)
## 4         147         Nigeria            NA                 <NA>
## 5         147         Nigeria            NA                 <NA>
## 6         147         Nigeria            NA                 <NA>
##   gtd_targsubtype2                                 gtd_targsubtype2_txt
## 1               NA                                                 <NA>
## 2               NA                                                 <NA>
## 3               15 Politician or Political Party Movement/Meeting/Rally
## 4               NA                                                 <NA>
## 5               NA                                                 <NA>
## 6               NA                                                 <NA>
##          gtd_corp2                                          gtd_target2
## 1             <NA>                                                 <NA>
## 2             <NA>                                                 <NA>
## 3 Niger Government The Niger state governor was targeted in the attack.
## 4             <NA>                                                 <NA>
## 5             <NA>                                                 <NA>
## 6             <NA>                                                 <NA>
##   gtd_natlty2 gtd_natlty2_txt gtd_targtype3 gtd_targtype3_txt gtd_targsubtype3
## 1          NA            <NA>          <NA>              <NA>             <NA>
## 2          NA            <NA>          <NA>              <NA>             <NA>
## 3         147         Nigeria          <NA>              <NA>             <NA>
## 4          NA            <NA>          <NA>              <NA>             <NA>
## 5          NA            <NA>          <NA>              <NA>             <NA>
## 6          NA            <NA>          <NA>              <NA>             <NA>
##   gtd_targsubtype3_txt gtd_corp3 gtd_target3 gtd_natlty3 gtd_natlty3_txt
## 1                 <NA>      <NA>        <NA>        <NA>            <NA>
## 2                 <NA>      <NA>        <NA>        <NA>            <NA>
## 3                 <NA>      <NA>        <NA>        <NA>            <NA>
## 4                 <NA>      <NA>        <NA>        <NA>            <NA>
## 5                 <NA>      <NA>        <NA>        <NA>            <NA>
## 6                 <NA>      <NA>        <NA>        <NA>            <NA>
##   gtd_actor_tax gtd_gsubname gtd_gname2 gtd_gsubname2 gtd_gname3 gtd_gsubname3
## 1          <NA>         <NA>       <NA>          <NA>       <NA>          <NA>
## 2    Boko Haram         <NA>       <NA>          <NA>       <NA>          <NA>
## 3       Unknown         <NA>       <NA>          <NA>       <NA>          <NA>
## 4    Boko Haram         <NA>       <NA>          <NA>       <NA>          <NA>
## 5    Boko Haram         <NA>       <NA>          <NA>       <NA>          <NA>
## 6    Boko Haram         <NA>       <NA>          <NA>       <NA>          <NA>
##                                                                                                gtd_motive
## 1                                                                                                    <NA>
## 2                                                          The specific motive for the attack is unknown.
## 3                                                          The specific motive for the attack is unknown.
## 4 Specific motive is unknown; however, Boko Haram is engaged in an active campaign to enforce Sharia law.
## 5                                                                                                 Unknown
## 6                                                                                                 Unknown
##   gtd_guncertain1 gtd_guncertain2 gtd_guncertain3 gtd_individual gtd_nperps
## 1              NA              NA            <NA>             NA         NA
## 2               1              NA            <NA>              0          2
## 3               0              NA            <NA>              0          1
## 4               1              NA            <NA>              0        -99
## 5               1              NA            <NA>              0          3
## 6               1              NA            <NA>              0         20
##   gtd_nperpcap gtd_claimed gtd_claimmode gtd_claimmode_txt gtd_claim2
## 1           NA          NA            NA              <NA>         NA
## 2            0           0            NA              <NA>         NA
## 3            0           0            NA              <NA>         NA
## 4            0           0            NA              <NA>         NA
## 5            0           0            NA              <NA>         NA
## 6            0           0            NA              <NA>         NA
##   gtd_claimmode2 gtd_claimmode2_txt gtd_claim3 gtd_claimmode3
## 1           <NA>               <NA>       <NA>           <NA>
## 2           <NA>               <NA>       <NA>           <NA>
## 3           <NA>               <NA>       <NA>           <NA>
## 4           <NA>               <NA>       <NA>           <NA>
## 5           <NA>               <NA>       <NA>           <NA>
## 6           <NA>               <NA>       <NA>           <NA>
##   gtd_claimmode3_txt gtd_compclaim gtd_weaptype1         gtd_weaptype1_txt
## 1               <NA>          <NA>            NA                      <NA>
## 2               <NA>          <NA>             5                  Firearms
## 3               <NA>          <NA>             6 Explosives/Bombs/Dynamite
## 4               <NA>          <NA>             5                  Firearms
## 5               <NA>          <NA>             5                  Firearms
## 6               <NA>          <NA>             6 Explosives/Bombs/Dynamite
##   gtd_weapsubtype1          gtd_weapsubtype1_txt gtd_weaptype2
## 1               NA                          <NA>            NA
## 2                5              Unknown Gun Type            NA
## 3               17          Other Explosive Type            NA
## 4                4 Rifle/Shotgun (non-automatic)            NA
## 5                2              Automatic Weapon            NA
## 6               16        Unknown Explosive Type            NA
##   gtd_weaptype2_txt gtd_weapsubtype2 gtd_weapsubtype2_txt gtd_weaptype3
## 1              <NA>               NA                 <NA>            NA
## 2              <NA>               NA                 <NA>            NA
## 3              <NA>               NA                 <NA>            NA
## 4              <NA>               NA                 <NA>            NA
## 5              <NA>               NA                 <NA>            NA
## 6              <NA>               NA                 <NA>            NA
##   gtd_weaptype3_txt gtd_weapsubtype3 gtd_weapsubtype3_txt gtd_weaptype4
## 1              <NA>               NA                 <NA>          <NA>
## 2              <NA>               NA                 <NA>          <NA>
## 3              <NA>               NA                 <NA>          <NA>
## 4              <NA>               NA                 <NA>          <NA>
## 5              <NA>               NA                 <NA>          <NA>
## 6              <NA>               NA                 <NA>          <NA>
##   gtd_weaptype4_txt gtd_weapsubtype4 gtd_weapsubtype4_txt
## 1              <NA>             <NA>                 <NA>
## 2              <NA>             <NA>                 <NA>
## 3              <NA>             <NA>                 <NA>
## 4              <NA>             <NA>                 <NA>
## 5              <NA>             <NA>                 <NA>
## 6              <NA>             <NA>                 <NA>
##                                           gtd_weapdetail gtd_nkill gtd_nkillus
## 1                                                   <NA>        NA          NA
## 2              Unknown firearms were used in the attack.         2           0
## 3 An improvised explosive device was used in the attack.        10           0
## 4                                     Kalashnikov rifles         5           0
## 5                                     Kalashnikov rifles         1           0
## 6                   A building was bombed and burnt down         0           0
##   gtd_nkillter gtd_nwound gtd_nwoundus gtd_nwoundte gtd_property gtd_propextent
## 1           NA         NA           NA           NA           NA             NA
## 2            0          0            0            0            0             NA
## 3            0         34            0            0           -9             NA
## 4            0          0            0            0           -9             NA
## 5            0          0            0            0           -9             NA
## 6            0          0            0            0            1              4
##   gtd_propextent_txt gtd_propvalue
## 1               <NA>            NA
## 2               <NA>            NA
## 3               <NA>            NA
## 4               <NA>            NA
## 5               <NA>            NA
## 6            Unknown            NA
##                                                         gtd_propcomment
## 1                                                                  <NA>
## 2                                                                  <NA>
## 3 The attack caused an unknown amount of property damage to the school.
## 4                                                                  <NA>
## 5                                                                  <NA>
## 6                              A police post was bombed and burned down
##   gtd_ishostkid gtd_nhostkid gtd_nhostkidus gtd_nhours gtd_ndays gtd_divert
## 1            NA           NA             NA         NA        NA       <NA>
## 2             0           NA             NA         NA        NA       <NA>
## 3             0           NA             NA         NA        NA       <NA>
## 4             0           NA             NA         NA        NA       <NA>
## 5             0           NA             NA         NA        NA       <NA>
## 6             0           NA             NA         NA        NA       <NA>
##   gtd_kidhijcountry gtd_ransom gtd_ransomamt gtd_ransomamtus gtd_ransompaid
## 1              <NA>         NA            NA            <NA>             NA
## 2              <NA>         NA            NA            <NA>             NA
## 3              <NA>         NA            NA            <NA>             NA
## 4              <NA>         NA            NA            <NA>             NA
## 5              <NA>         NA            NA            <NA>             NA
## 6              <NA>         NA            NA            <NA>             NA
##   gtd_ransompaidus gtd_ransomnote gtd_hostkidoutcome gtd_hostkidoutcome_txt
## 1             <NA>           <NA>                 NA                   <NA>
## 2             <NA>           <NA>                 NA                   <NA>
## 3             <NA>           <NA>                 NA                   <NA>
## 4             <NA>           <NA>                 NA                   <NA>
## 5             <NA>           <NA>                 NA                   <NA>
## 6             <NA>           <NA>                 NA                   <NA>
##   gtd_nreleased
## 1            NA
## 2            NA
## 3            NA
## 4            NA
## 5            NA
## 6            NA
##                                                                                                                                                                                                                                           gtd_addnotes
## 1                                                                                                                                                                                                                                                 <NA>
## 2                                                                                                                                                                                                                                                 <NA>
## 3 The most recent available sources listed the fatalities for this attack from three to 10, and the injuries for this attack from 21 to 34, so the majority casualty figures have been used in order to preserve statistical accuracy in the database.
## 4                                                                                                                                                                                                                                                 <NA>
## 5                                                                                                                                                                                                                                                 <NA>
## 6                                                                                                                                                                                                                                                 <NA>
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              gtd_scite1
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  <NA>
## 2 Yahoo News, "Police: Radical Sect Kills Three in Northeast Nigeria," Associated Press, http://news.yahoo.com/police-radical-sect-kills-3-northeast-nigeria-163009693.html;_ylt=AsyNFAPfBy7lw5yKvmQ_G6696Q8F;_ylu=X3oDMTQ0Nmc3b2t0BG1pdANUb3BTdG9yeSBXb3JsZFNGIEFmcmljYVNTRgRwa2cDNzc5YjBjYTItMWU1Mi0zYTQzLThhMmItMTFhMmQ1NmQ3ODRjBHBvcwM5BHNlYwN0b3Bfc3RvcnkEdmVyAzE2NTZkNmEwLWVkZGQtMTFlMC1iOWJmLTdhNDJlZDhhODIyZg--;_ylg=X3oDMTFxaTJhMjZtBGludGwDdXMEbGFuZwNlbi11cwRwc3RhaWQDBHBzdGNhdAN3b3JsZHxhZnJpY2EEcHQDc2VjdGlvbnM-;_ylv=3 (October 3, 2011).
## 3                                                                                                                                                                                                                                                                                                                                                                          Chinwendu Nnadozi, "Explosion Kills 10 at PDP Rally in Niger," Daily Independent, March 03, 2011, http://www.independentngonline.com/DailyIndependent/Article.aspx?id=29876.
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              "Update:APNewsNow," Associated Press, December 17, 2011.
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                          Nazifi Dawud Khalid, "Gunmen Kill Protocol Officer, Herbalist in Maiduguri," Daily Trust, November 28, 2011.
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                              "Suspected Islamists attack Nigerian police post, govt office," Agence France Presse, November 10, 2011.
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            gtd_scite2
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <NA>
## 2 Yahoo News, "Gunmen Kill Three in Violence-torn Nigerian City: Police," Agence France Presse, http://news.yahoo.com/gunmen-kill-three-violence-torn-nigerian-city-police-181747975.html;_ylt=Argzz.1hYeHUn2SzxZ5zGEK96Q8F;_ylu=X3oDMTQ0aW8zb2JmBG1pdANUb3BTdG9yeSBXb3JsZFNGIEFmcmljYVNTRgRwa2cDMTJkNTg0ZTEtZTJiOC0zYjViLWFhNGMtYWViN2YzZmVmMGFhBHBvcwM0BHNlYwN0b3Bfc3RvcnkEdmVyAzYxN2FmZGYwLWVkZWMtMTFlMC1iNmYxLTExZmRmMGMzOTllOA--;_ylg=X3oDMTFxaTJhMjZtBGludGwDdXMEbGFuZwNlbi11cwRwc3RhaWQDBHBzdGNhdAN3b3JsZHxhZnJpY2EEcHQDc2VjdGlvbnM-;_ylv=3 (October 3, 2011).
## 3                                                                                                                                                                                                                                                                                                                                                                              Xinhua News Agency, "Ten People Feared Killed in Nigeria Rally Bomb Blast," Xinhua News Agency, March 04, 2011, http://news.xinhuanet.com/english2010/world/2011-03/04/c_13760030.htm.
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <NA>
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <NA>
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                                        Michael Olugbode and John Shiklam, "Gunmen Attack Police Station, FRSC Office," This Day, November 11, 2011.
##                                                                                                                                                                                                                                                                                                                     gtd_scite3
## 1                                                                                                                                                                                                                                                                                                                         <NA>
## 2 Washington Post, "Police: Radical Muslim Sect Kills Tea Seller, Pharmacist and Bystander in Nigeria's Northeast," Associated Press, October 3, 2011, http://www.washingtonpost.com/world/africa/police-radical-muslim-sect-kills-tea-seller-pharmacist-and-bystander-in-nigerias-northeast/2011/10/03/gIQAPQoFIL_story.html.
## 3                                                                                                                                                                                            Jane\xd5s Intelligence, \xd2IED Attack Targets Political Rally near Nigerian Capital,\xd3 Terrorism Watch Report, March 03, 2011.
## 4                                                                                                                                                                                                                                                                                                                         <NA>
## 5                                                                                                                                                                                                                                                                                                                         <NA>
## 6                                                                                                                                                                                                                                                                                                                         <NA>
##               gtd_dbsource gtd_INT_LOG gtd_INT_IDEO gtd_INT_MISC gtd_INT_ANY
## 1                     <NA>          NA           NA           NA          NA
## 2                     ISVG           0            0            0           0
## 3                     ISVG          -9           -9            0          -9
## 4 START Primary Collection           0            0            0           0
## 5 START Primary Collection           0            0            0           0
## 6 START Primary Collection           0            0            0           0
##                  gtd_related   gtd_date gtd_data.source gtd_enddate
## 1                       <NA>       <NA>            <NA>        <NA>
## 2                       <NA> 2011-10-03             gtd  2011-10-03
## 3                       <NA> 2011-03-02             gtd  2011-03-02
## 4                       <NA> 2011-12-15             gtd  2011-12-15
## 5 201111270022, 201111270023 2011-11-27             gtd  2011-11-27
## 6 201111090012, 201111090013 2011-11-09             gtd  2011-11-09
##   scad_eventid scad_id scad_ccode scad_countryname  scad_date scad_enddate
## 1      4751061    1061        475          Nigeria 2011-11-03   2011-11-03
## 2           NA      NA         NA             <NA>       <NA>         <NA>
## 3      4750991     991        475          Nigeria 2011-03-03   2011-03-03
## 4      4751078    1078        475          Nigeria 2011-12-15   2011-12-15
## 5           NA      NA         NA             <NA>       <NA>         <NA>
## 6      4751065    1065        475          Nigeria 2011-11-10   2011-11-10
##   scad_duration scad_stday scad_stmo scad_styr scad_eday scad_emo scad_eyr
## 1             1          3        11      2011         3       11     2011
## 2            NA         NA        NA        NA        NA       NA       NA
## 3             1          3         3      2011         3        3     2011
## 4             1         15        12      2011        15       12     2011
## 5            NA         NA        NA        NA        NA       NA       NA
## 6             1         10        11      2011        10       11     2011
##   scad_event_tax scad_escalation      scad_actor_tax scad_actor2 scad_actor3
## 1              8               0  Suspected militant        <NA>        <NA>
## 2             NA              NA                <NA>        <NA>        <NA>
## 3              9               0  Unknown assailants        <NA>        <NA>
## 4              9               0 Suspected militants        <NA>        <NA>
## 5             NA              NA                <NA>        <NA>        <NA>
## 6              8               0 Suspected militants        <NA>        <NA>
##                scad_target1 scad_target2 scad_cgovtarget scad_rgovtarget
## 1                  Soldiers         <NA>               1               0
## 2                      <NA>         <NA>              NA              NA
## 3 Peoples Demoncratic Party         <NA>               0               0
## 4                 Civilians         <NA>               0               0
## 5                      <NA>         <NA>              NA              NA
## 6                    Police         <NA>               1               0
##   scad_npart scad_ndeath scad_repress scad_elocal scad_ilocal scad_sublocal
## 1          1           1            0   Maiduguri   Maiduguri             1
## 2         NA          NA           NA        <NA>        <NA>            NA
## 3          1           4            0      Suleja      Suleja             1
## 4        -99           5            0   Maiduguri   Maiduguri             1
## 5         NA          NA           NA        <NA>        <NA>            NA
## 6        -99           2            0      Mainok      Mainok             1
##   scad_locnum scad_gislocnum scad_issue1 scad_issue2 scad_issue3
## 1           2              2           6          NA          NA
## 2          NA             NA          NA          NA          NA
## 3           2              2           1          NA          NA
## 4           2              2           6          NA          NA
## 5          NA             NA          NA          NA          NA
## 6           3              3           6          NA          NA
##                                                                                                                         scad_issuenote
## 1                                                                                  A suspected militant opens fire, killing a soldier.
## 2                                                                                                                                 <NA>
## 3 Assailants toss a bomb at an election rally from a moving car.  They miss their target, instead hitting a roadside vegetable market.
## 4                                                        Suspected Boko Haram militants shoot dead 5 civilians in a drive-by shooting.
## 5                                                                                                                                 <NA>
## 6                                                                                           Suspected militants bomb a police station.
##   scad_nsource scad_notes scad_coder scad_acd_questionable scad_latitude
## 1            0       <NA>         CL                     1      11.83330
## 2         <NA>       <NA>       <NA>                    NA            NA
## 3            1       <NA>         CL                     1       9.18052
## 4            1       <NA>         CL                     1      11.83330
## 5         <NA>       <NA>       <NA>                    NA            NA
## 6            1       <NA>         CL                     1      11.82880
##   scad_longitude scad_geo_comments scad_location_precision scad_year
## 1       13.15000              <NA>                    <NA>      2011
## 2             NA              <NA>                    <NA>        NA
## 3        7.17933              <NA>                    <NA>      2011
## 4       13.15000              <NA>                    <NA>      2011
## 5             NA              <NA>                    <NA>        NA
## 6       12.63450              <NA>                    <NA>      2011
##   scad_data.source ged_id ged_year ged_active_year ged_event_tax
## 1             scad  42105     2011               1             1
## 2             <NA>  42306     2011               1             3
## 3             scad  42272     2011               1             3
## 4             scad     NA       NA              NA            NA
## 5             <NA>  42110     2011               1             1
## 6             scad  42106     2011               1             1
##   ged_conflict_new_id                                      ged_conflict_name
## 1                 297                                     Nigeria:Government
## 2                1850 Jama'atu Ahlis Sunna Lidda'awati wal-Jihad - Civilians
## 3                1850 Jama'atu Ahlis Sunna Lidda'awati wal-Jihad - Civilians
## 4                  NA                                                   <NA>
## 5                 297                                     Nigeria:Government
## 6                 297                                     Nigeria:Government
##   ged_dyad_new_id
## 1             640
## 2            2332
## 3            2332
## 4              NA
## 5             640
## 6             640
##                                                        ged_dyad_name
## 1 Government of Nigeria - Jama'atu Ahlis Sunna Lidda'awati wal-Jihad
## 2             Jama'atu Ahlis Sunna Lidda'awati wal-Jihad - Civilians
## 3             Jama'atu Ahlis Sunna Lidda'awati wal-Jihad - Civilians
## 4                                                               <NA>
## 5 Government of Nigeria - Jama'atu Ahlis Sunna Lidda'awati wal-Jihad
## 6 Government of Nigeria - Jama'atu Ahlis Sunna Lidda'awati wal-Jihad
##   ged_side_a_new_id ged_gwnoa                              ged_actor_tax
## 1                84       475                      Government of Nigeria
## 2              1051        NA Jama'atu Ahlis Sunna Lidda'awati wal-Jihad
## 3              1051        NA Jama'atu Ahlis Sunna Lidda'awati wal-Jihad
## 4                NA        NA                                       <NA>
## 5                84       475                      Government of Nigeria
## 6                84       475                      Government of Nigeria
##   ged_side_b_new_id ged_gwnob                                 ged_side_b
## 1              1051      <NA> Jama'atu Ahlis Sunna Lidda'awati wal-Jihad
## 2                 1      <NA>                                  Civilians
## 3                 1      <NA>                                  Civilians
## 4                NA      <NA>                                       <NA>
## 5              1051      <NA> Jama'atu Ahlis Sunna Lidda'awati wal-Jihad
## 6              1051      <NA> Jama'atu Ahlis Sunna Lidda'awati wal-Jihad
##   ged_number_of_sources
## 1                    -1
## 2                    -1
## 3                    -1
## 4                    NA
## 5                    -1
## 6                    -1
##                                                                                                                                                                                                                                                                        ged_source_article
## 1                                                                                                                                                                                                   Agence France Presse 4 November 2011 "Soldier shot dead amid arms searches in Nigeria
## 2                                                                                                                                                                                           Agence France Presse 3 October 2011 "Gunmen kill three in violence-torn Nigerian city: police
## 3 Reuters News 3 March 2011 "UPDATE 1-Three killed in explosion at Nigeria election rally"; Agence France Presse 8 March 2011 "Nigerian opposition politician charged with bombing"; The Guardian/BBC 14 September 2011 "Nigerian court remands suspected Islamic sect members to custody
## 4                                                                                                                                                                                                                                                                                    <NA>
## 5                                                                                      Agence France Presse 28 November 2011 "Nigerian Islamists kill state governor's aide"; Daily Trust/BBC 28 November 2011 "Nigeria: Suspected Boko Haram gunmen kill protocol officer in Borno State
## 6                                                                                                                                                                                     Agence France Presse 10 November 2011 "Suspected Islamists attack Nigerian police post, govt office
##   ged_source_office ged_source_date ged_source_headline
## 1              <NA>              NA                <NA>
## 2              <NA>              NA                <NA>
## 3              <NA>              NA                <NA>
## 4              <NA>              NA                <NA>
## 5              <NA>              NA                <NA>
## 6              <NA>              NA                <NA>
##               ged_source_original ged_where_prec ged_where_coordinates
## 1 Borno state police commissioner              1        Maiduguri town
## 2              state police chief              1        Maiduguri town
## 3                          police              1           Suleja town
## 4                            <NA>             NA                  <NA>
## 5                          police              1        Maiduguri town
## 6               police, witnesses              1           Mainok town
##     ged_adm_1     ged_adm_2 ged_latitude ged_longitude
## 1 Borno state Maiduguri lga     11.84644      13.16027
## 2 Borno state Maiduguri lga     11.84644      13.16027
## 3 Niger state    Suleja lga      9.18052       7.17933
## 4        <NA>          <NA>           NA            NA
## 5 Borno state Maiduguri lga     11.84644      13.16027
## 6 Borno state      Kaga lga     11.83022      12.63067
##                  ged_geom_wkt ged_priogrid_gid ged_country ged_country_id
## 1 POINT (13.160274 11.846440)           146547     Nigeria            475
## 2 POINT (13.160274 11.846440)           146547     Nigeria            475
## 3   POINT (7.179330 9.180520)           142935     Nigeria            475
## 4                        <NA>               NA        <NA>             NA
## 5 POINT (13.160274 11.846440)           146547     Nigeria            475
## 6 POINT (12.630670 11.830220)           146546     Nigeria            475
##   ged_region ged_event_clarity ged_date_prec   ged_date ged_enddate
## 1     Africa                 1             1 2011-11-02  2011-11-02
## 2     Africa                 1             1 2011-10-03  2011-10-03
## 3     Africa                 1             1 2011-03-03  2011-03-03
## 4       <NA>                NA            NA       <NA>        <NA>
## 5     Africa                 1             1 2011-11-27  2011-11-27
## 6     Africa                 1             1 2011-11-09  2011-11-09
##   ged_deaths_a ged_deaths_b ged_deaths_civilians ged_deaths_unknown ged_best
## 1            0            0                    0                  0        0
## 2            0            0                    0                  0        0
## 3            0            0                    0                  0        0
## 4           NA           NA                   NA                 NA       NA
## 5            0            0                    0                  0        0
## 6            0            0                    0                  0        0
##   ged_low ged_high ged_data.source acled_gwno acled_event_id_cnty
## 1       0        1             ged         NA                <NA>
## 2       0        3             ged         NA                <NA>
## 3       0        3             ged         NA                <NA>
## 4      NA       NA            <NA>         NA                <NA>
## 5       0        2             ged         NA                <NA>
## 6       0        4             ged         NA                <NA>
##   acled_event_id_no_cnty acled_date acled_year acled_time_precision
## 1                     NA       <NA>         NA                   NA
## 2                     NA       <NA>         NA                   NA
## 3                     NA       <NA>         NA                   NA
## 4                     NA       <NA>         NA                   NA
## 5                     NA       <NA>         NA                   NA
## 6                     NA       <NA>         NA                   NA
##   acled_event_tax acled_actor_tax acled_ally_actor_1 acled_inter1
## 1            <NA>            <NA>               <NA>           NA
## 2            <NA>            <NA>               <NA>           NA
## 3            <NA>            <NA>               <NA>           NA
## 4            <NA>            <NA>               <NA>           NA
## 5            <NA>            <NA>               <NA>           NA
## 6            <NA>            <NA>               <NA>           NA
##   acled_actor1_id acled_actor2 acled_ally_actor_2 acled_inter2 acled_actor2_id
## 1              NA         <NA>               <NA>           NA              NA
## 2              NA         <NA>               <NA>           NA              NA
## 3              NA         <NA>               <NA>           NA              NA
## 4              NA         <NA>               <NA>           NA              NA
## 5              NA         <NA>               <NA>           NA              NA
## 6              NA         <NA>               <NA>           NA              NA
##   acled_interaction acled_actor_dyad_id acled_country acled_admin1 acled_admin2
## 1                NA                <NA>          <NA>         <NA>         <NA>
## 2                NA                <NA>          <NA>         <NA>         <NA>
## 3                NA                <NA>          <NA>         <NA>         <NA>
## 4                NA                <NA>          <NA>         <NA>         <NA>
## 5                NA                <NA>          <NA>         <NA>         <NA>
## 6                NA                <NA>          <NA>         <NA>         <NA>
##   acled_admin3 acled_location acled_latitude acled_longitude
## 1         <NA>           <NA>             NA              NA
## 2         <NA>           <NA>             NA              NA
## 3         <NA>           <NA>             NA              NA
## 4         <NA>           <NA>             NA              NA
## 5         <NA>           <NA>             NA              NA
## 6         <NA>           <NA>             NA              NA
##   acled_geo_precision acled_source acled_notes acled_fatalities
## 1                  NA         <NA>        <NA>               NA
## 2                  NA         <NA>        <NA>               NA
## 3                  NA         <NA>        <NA>               NA
## 4                  NA         <NA>        <NA>               NA
## 5                  NA         <NA>        <NA>               NA
## 6                  NA         <NA>        <NA>               NA
##   acled_data.source acled_enddate
## 1              <NA>          <NA>
## 2              <NA>          <NA>
## 3              <NA>          <NA>
## 4              <NA>          <NA>
## 5              <NA>          <NA>
## 6              <NA>          <NA>

The number of entries corresponds with the number of located matches.

dim(dups)
## [1] 150 263

By examining the overlapping entries, we can get a better understanding of which events matched and why. Depending on the research question, the information about overlapping events can be just as valuable as the de-duplicated frame. meltt_duplicates() allows the researcher to make these kinds of inquiries.

Again, let’s extract descriptive information to qualitatively compare events, given the available text descriptions of the events. This offers a quick way to see how well our input assumptions performed when merging events.

dups2 <- meltt_duplicates(output,
                          columns = c("notes","summary","issuenote",
                                      "source_headline"))
head(dups2) 
##   acled_dataset acled_event ged_dataset ged_event scad_dataset scad_event
## 1             0           0           2        66            3        105
## 2             0           0           2       194            0          0
## 3             0           0           2       160            3         21
## 4             0           0           0         0            3        123
## 5             0           0           2        70            0          0
## 6             0           0           2        67            3        110
##   gtd_dataset gtd_event
## 1           0         0
## 2           4       114
## 3           4        32
## 4           4       163
## 5           4       156
## 6           4       143
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   gtd_summary
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <NA>
## 2                                                                                                                                                                                                                                                                       10/03/2011: On Monday morning, in Maiduguri, Borno, Nigeria, two militants fired upon and killed a tea seller and a civilian outside a tea shop. No group has claimed responsibility, but the militant group Boko Haram was thought to be responsible for the attack.
## 3                                                                             03/02/2011: On Wednesday afternoon around 1330, in Suleja, Niger, Nigeria, 10 people were killed and 34 others were injured when one man threw an improvised explosive device at a Peoples Democratic Party campaign rally for Niger Governor Babangida Aliyu, at a Nigerian government secondary school. Babangida Aliyu was not injured, but one bus sustained an unknown amount of damage in the attack. No group has claimed responsibility for the attack.
## 4                                                                                                                                                                                 12/15/2011: Suspected members of Boko Haram opened fire on a group of civilians standing outside of a shop in Maiduguri city, Borno state, Nigeria.  Five civilians were killed in the shooting; however, there were no reported injuries.  The assailants were traveling in a vehicle at the time of the attack and fled the scene following the shooting.
## 5 11/27/2011: Three unidentified gunmen shot and killed a government employee in Gwange ward of Maiduguri city, Borno state, Nigeria.  The victim, Kala Boro, was a protocol officer for the Borno state Government House.  The assailants followed him home from work and shot him while he was in his car.  This was one of two multiple incidents; the assailants killed an herbalist in a separate incident after killing Boro.  No group claimed responsibility for the incident; however, sources suspect the involvement of Boko Harm.
## 6                                                                                                            11/9/2011: Approximately 20 suspected members of Boko Haram attacked a police station in Mainok village, Borno state, Nigeria.  The assailants threw explosives inside and burned the police station down; there were no reported injuries as the police station had been closed some time before.  The attack on the police station happened in conjunction with an attack on a federal road safety office in the same village.
##                                                                                                                         scad_issuenote
## 1                                                                                  A suspected militant opens fire, killing a soldier.
## 2                                                                                                                                 <NA>
## 3 Assailants toss a bomb at an election rally from a moving car.  They miss their target, instead hitting a roadside vegetable market.
## 4                                                        Suspected Boko Haram militants shoot dead 5 civilians in a drive-by shooting.
## 5                                                                                                                                 <NA>
## 6                                                                                           Suspected militants bomb a police station.
##   scad_notes ged_source_headline acled_notes
## 1       <NA>                <NA>        <NA>
## 2       <NA>                <NA>        <NA>
## 3       <NA>                <NA>        <NA>
## 4       <NA>                <NA>        <NA>
## 5       <NA>                <NA>        <NA>
## 6       <NA>                <NA>        <NA>
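
One quick way to characterize the overlap is to tabulate which sources contribute to each match. The following is a minimal sketch on a hand-typed copy of the first six rows of the index columns printed above. It assumes, as the output suggests, that a 0 marks a source with no entry in that match. The names idx, present, src, and combo are illustrative and not part of meltt.

```r
# Toy copy of the *_event index columns for the six rows shown above.
# Assumption: a value of 0 means the source has no entry in that match.
idx <- data.frame(
  acled_event = c(0, 0, 0, 0, 0, 0),
  ged_event   = c(66, 194, 160, 0, 70, 67),
  scad_event  = c(105, 0, 21, 123, 0, 110),
  gtd_event   = c(0, 114, 32, 163, 156, 143)
)

# Logical matrix: TRUE where a source contributes an entry to the match
present <- idx > 0

# Number of sources involved in each match
n_sources <- rowSums(present)

# Label each match by the combination of sources involved
src   <- sub("_event$", "", colnames(present))
combo <- apply(present, 1, function(r) paste(src[r], collapse = "+"))

# Frequency of each source combination
print(table(combo))
```

With the full dups frame, the same idea would apply to the columns whose names end in _event.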

Inspecting potential event and episode matches

As noted above, event-to-episode matches are flagged but not automatically merged. To resolve them, the user needs to inspect the flagged entries and decide which are actual matches and which are not. We leave this step to the user because events and episodes technically occur at different units of analysis, so determining whether they are unique or duplicate entries requires discretion. Note that we are developing a shiny app to help ease this assessment process.

The meltt_inspect() function streamlines this process. The function returns a list that contains comparative information on each potential event and episode match.

assess = meltt_inspect(output)
## 
## Note:
## 40 entries flagged as event-to-episode matches. List generated for user evaluation for all potential matches.

The user then manually reviews each entry by cycling through the returned list.

# Information on the event
assess[[1]]$`Flagged Event Information`
##    dataset obs.count data.source       date    enddate latitude longitude
## 61       1        60       acled 2011-02-07 2011-02-07   7.5887    8.2087
##                     event_tax                       actor_tax
## 61 Violence against civilians Fulani Ethnic Militia (Nigeria)
# Information on the episode
assess[[1]]$`Flagged Episode Information`
##     dataset obs.count data.source       date    enddate latitude longitude
## 476       2       118         ged 2011-02-07 2011-02-11   7.5834    8.2055
##     event_tax actor_tax
## 476         2    Fulani

The function takes an object of class meltt and has two accompanying arguments: columns and confirmed_matches. When flagged events are found to match, the user can pass this information to fold those entries in as de-duplicated entries in the returned frame. To do so, the user must provide a logical vector whose length equals the total number of flagged entries. Flagged entries marked TRUE are treated as matches, and those marked FALSE are treated as unique. The returned frame is then similar to the output of meltt_data().

By way of example:

length(assess)
## [1] 40
retain = rep(FALSE,length(assess))
retain[1:20] = TRUE # Let's say half are ID'ed as duplicates
retain
##  [1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [13]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
## [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [37] FALSE FALSE FALSE FALSE
uevents3  = meltt_inspect(output,columns="event_tax",confirmed_matches = retain)
## All confirmed event-to-episode duplicates have been removed.
dim(uevents3)
## [1] 687   6

Note that the total number of de-duplicated events has fallen, reflecting the newly identified (and now removed) duplicates of existing events: the 691 original de-duplicated entries reduce to 687. The reduction is small because multiple events can map to the same episode. There are 40 events flagged as matching episodes, but only 18 unique episodes that can potentially be removed as duplicates. This underscores the need for user discretion.
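
In practice, the logical vector passed to confirmed_matches would come from reviewing each flagged entry rather than from marking the first half as TRUE. The following is a minimal sketch of one way to pre-screen entries before manual review. It uses a toy list that mimics the element names and date columns shown above. The second entry is invented for illustration, and the date-window rule is an assumption, not meltt's matching logic.

```r
# Toy stand-in for the list returned by meltt_inspect(); only the date
# columns are reproduced. Entry 1 copies the dates printed above,
# entry 2 is hypothetical.
assess_toy <- list(
  list(
    `Flagged Event Information` = data.frame(
      date = as.Date("2011-02-07"), enddate = as.Date("2011-02-07")),
    `Flagged Episode Information` = data.frame(
      date = as.Date("2011-02-07"), enddate = as.Date("2011-02-11"))
  ),
  list(
    `Flagged Event Information` = data.frame(
      date = as.Date("2011-03-01"), enddate = as.Date("2011-03-01")),
    `Flagged Episode Information` = data.frame(
      date = as.Date("2011-04-01"), enddate = as.Date("2011-04-05"))
  )
)

# Illustrative rule: treat an entry as a candidate match only if the
# event falls entirely inside the episode's date window.
in_window <- function(entry) {
  ev <- entry$`Flagged Event Information`
  ep <- entry$`Flagged Episode Information`
  all(ev$date >= ep$date & ev$enddate <= ep$enddate)
}

# One TRUE/FALSE per flagged entry, in the order of the list
confirmed <- vapply(assess_toy, in_window, logical(1))
print(confirmed)
```

A vector built this way has one element per flagged entry, which is the shape confirmed_matches expects. Any such rule should only narrow down the manual review, not replace it.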

Inside the Output Object

Like many S3 objects, the output from meltt is a nested list containing a range of useful information. It retains the original input data, the taxonomies, and the specification assumptions, as well as lists of contender events (i.e. events that were flagged as potential matches but did not match as closely as another event). Note that we are expanding meltt’s functionality to include more post-processing functions that ease extraction of this information; for now, it can be accessed with the usual $ operator.

names(output)
## [1] "processed"      "inputData"      "parameters"     "inputDataNames"
## [5] "taxonomy"
head(output$processed$event_contenders)
##   dataset event bestmatch_data bestmatch_event bestmatch_score runnerUp1_data
## 1       1    24              2               7       0.5833333              0
## 2       1    58              2              85       0.3333333              0
## 3       1    69              2             236       0.5000000              0
## 4       1    78              2               8       0.4166667              0
## 5       1   103              2             106       0.3333333              0
## 6       1   177              2             204       0.2500000              0
##   runnerUp1_event runnerUp1_score runnerUp2_data runnerUp2_event
## 1               0               0              0               0
## 2               0               0              0               0
## 3               0               0              0               0
## 4               0               0              0               0
## 5               0               0              0               0
## 6               0               0              0               0
##   runnerUp2_score events_matched
## 1               0              1
## 2               0              1
## 3               0              1
## 4               0              1
## 5               0              1
## 6               0              1
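
Once extracted, the contender table is an ordinary data frame and can be sorted or subset like any other. The following is a minimal sketch on a hand-typed copy of the six rows printed above, with only the best-match columns reproduced. It sorts by bestmatch_score so that the extremes can be inspected first, and it makes no assumption about whether a low or a high score indicates a closer match.

```r
# Toy copy of the first six contender rows shown above
contenders <- data.frame(
  dataset         = c(1, 1, 1, 1, 1, 1),
  event           = c(24, 58, 69, 78, 103, 177),
  bestmatch_data  = c(2, 2, 2, 2, 2, 2),
  bestmatch_event = c(7, 85, 236, 8, 106, 204),
  bestmatch_score = c(0.5833333, 0.3333333, 0.5, 0.4166667, 0.3333333, 0.25)
)

# Sort by score in ascending order; ties keep their original order
ranked <- contenders[order(contenders$bestmatch_score), ]
print(ranked)
```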