“The Fighting Season” & Detecting Seasonality Using Facebook’s Prophet
In this analysis, I use data from the Armed Conflict Location & Event Data Project (“ACLED”). ACLED is a disaggregated conflict data collection, analysis and crisis mapping project that records the dates, actors, types of violence, locations, and fatalities of all reported political violence and protest events across the globe. Here, I use ACLED’s data for Afghanistan from 2017 to March 2019.
Bar charts & Box plots
To identify the existence of seasonality in Afghan daily fatality data, I first explore the data visually, starting with the monthly means. Figure 2 shows the mean monthly fatality data for Afghanistan. Some seasonality is discernible in the monthly means: May, July and August have the highest values.
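The monthly means behind a chart like this can be sketched in a few lines. This is a minimal Python illustration using only the standard library, not the code used for the article. The function name `monthly_means`, the input layout (a mapping from date to daily fatality count), the choice to average daily counts within each calendar month, and the toy numbers are all assumptions made for the example; they are not ACLED figures.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_means(daily_fatalities):
    """Average the daily counts that fall in each calendar month.

    daily_fatalities: dict mapping datetime.date -> fatality count.
    Returns a dict {month number (1-12): mean daily count}, pooled across years.
    """
    by_month = defaultdict(list)
    for day, count in daily_fatalities.items():
        by_month[day.month].append(count)
    return {month: mean(counts) for month, counts in sorted(by_month.items())}


# Hypothetical toy values for illustration only (NOT ACLED data).
toy = {
    date(2017, 1, 5): 10,
    date(2018, 1, 9): 20,
    date(2017, 7, 3): 60,
    date(2018, 7, 8): 80,
}

for month, value in monthly_means(toy).items():
    print(month, value)
```

Pooling the same month across years is what makes a seasonal pattern visible, but it also hides year-to-year differences, which is one reason to look at the full distribution as well.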
Figure 3 shows a boxplot of the daily fatality data grouped by month. The boxplot includes the mean, median, maximum, minimum, percentiles, and outliers. The boxplots give some visual evidence of a seasonal pattern: the most extreme outliers fall during the “fighting season”, and the median daily fatalities for each month follow a similar seasonal pattern. The boxplot also shows that the distribution varies by month. For example, the daily fatality data for July are markedly skewed. Overall, the boxplots seem to support the hypothesis that Afghan fatality data have an annual seasonal pattern, with higher levels between May and August and lower levels between November and February.
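For readers who want the numbers behind a boxplot, the sketch below computes the quartiles and flags outliers for one month of daily counts. It assumes the common 1.5 × IQR rule for outliers and the "inclusive" quartile method from Python's `statistics` module; the article does not say which conventions its plotting library used, so the exact values could differ slightly. The function name `box_stats` and the toy values are hypothetical.

```python
from statistics import quantiles


def box_stats(values):
    """Quartiles and 1.5 * IQR outliers for a list of daily counts."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


# Hypothetical daily counts for a single month (NOT ACLED data).
# One very large day pulls the mean well above the median,
# which is the kind of right skew described for July.
toy_july = [20, 25, 30, 35, 40, 45, 200]

print(box_stats(toy_july))
```

Running this per month and comparing the medians and the outlier lists gives the same reading as the figure, in tabular form.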
However, it is also clear that June has a consistently lower total fatality level than the months around it. It is thus important to consider the effect of Islamic observances such as Ramadan that may be affecting the data. Ramadan is the ninth month of the Islamic calendar, and is observed by Muslims worldwide as a month of fasting (Sawm). It lasts 29–30 days, and because the Islamic calendar is lunar, its dates move earlier in the Gregorian calendar (used here) each year. During the period studied, Ramadan fell between May and June. As the observance of Ramadan may reduce the frequency of armed clashes between different actors, such effects should be taken into account when examining seasonality in the data.
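One way to take Ramadan into account is to mark the affected days explicitly, so they can be compared with other days or handed to a model as a holiday effect. The sketch below is an illustration, not the article's method. The Ramadan windows are approximate Gregorian dates that can differ by a day or so depending on moon sighting and country, so they should be verified before any real use. The names `RAMADAN_WINDOWS`, `in_ramadan` and `ramadan_holiday_rows` are hypothetical. The `holiday` and `ds` keys follow the column layout Prophet documents for its holidays table.

```python
from datetime import date, timedelta

# ASSUMPTION: approximate first and last fasting days; may be off by about
# a day depending on moon sighting and country. Verify before real use.
RAMADAN_WINDOWS = [
    (date(2017, 5, 27), date(2017, 6, 24)),
    (date(2018, 5, 16), date(2018, 6, 14)),
    (date(2019, 5, 6), date(2019, 6, 3)),
]


def in_ramadan(day, windows=RAMADAN_WINDOWS):
    """True if `day` falls inside any of the given (start, end) windows."""
    return any(start <= day <= end for start, end in windows)


def ramadan_holiday_rows(windows=RAMADAN_WINDOWS):
    """One row per Ramadan day, keyed like Prophet's holidays table."""
    rows = []
    for start, end in windows:
        n_days = (end - start).days + 1
        for offset in range(n_days):
            rows.append({"holiday": "ramadan", "ds": start + timedelta(days=offset)})
    return rows


print(in_ramadan(date(2018, 6, 1)), in_ramadan(date(2018, 7, 1)))
print(len(ramadan_holiday_rows()), "days flagged")
```

With pandas available, `pandas.DataFrame(ramadan_holiday_rows())` yields a table that can be passed through Prophet's `holidays` argument, which lets the model estimate a separate Ramadan effect instead of folding it into the yearly curve.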
Facebook’s Prophet: Fourier Order for Seasonalities
Prophet is an open-source forecasting procedure released by Facebook’s Core Data Science team. It provides a completely automated forecast of time series data based on an additive model in which non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. Prophet is also robust to missing data and shifts in the trend, and typically handles outliers well.
Prophet relies on a Fourier series to provide a flexible model of periodic effects (Taylor & Letham, 2017). A Fourier series represents a periodic function as a sum of simple sine and cosine waves, and Prophet uses it to estimate seasonality in the time series data. A partial Fourier sum can approximate an arbitrary periodic signal, and the number of terms in the partial sum (the order) is a parameter that determines how quickly the seasonality can change. For this article, I ran Prophet from R.
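To make the role of the order concrete, here is a small sketch of a partial Fourier sum of the form described by Taylor & Letham: s(t) = sum over n = 1..N of [a_n cos(2πnt/P) + b_n sin(2πnt/P)], with period P (365.25 days for yearly seasonality) and order N. It is a hand-rolled Python illustration, not Prophet's internal code. The function names and the coefficient values are hypothetical; in Prophet the coefficients are estimated from the data.

```python
import math


def fourier_features(t_days, period, order):
    """Return [sin(1), cos(1), ..., sin(N), cos(N)] evaluated at time t.

    sin(n) and cos(n) stand for sin(2*pi*n*t/period) and cos(2*pi*n*t/period).
    A higher order adds faster-oscillating terms, so the curve can change faster.
    """
    features = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t_days / period
        features.extend([math.sin(angle), math.cos(angle)])
    return features


def seasonal_value(t_days, period, coefficients):
    """Weighted sum of the Fourier features; order is len(coefficients) // 2."""
    order = len(coefficients) // 2
    features = fourier_features(t_days, period, order)
    return sum(c * f for c, f in zip(coefficients, features))


# Hypothetical coefficients for an order-2 yearly term (not fitted values).
toy_coefficients = [1.0, 0.5, -0.3, 0.2]

for day in (0, 91, 182, 274):
    print(day, round(seasonal_value(day, 365.25, toy_coefficients), 3))
```

An order of 10 for the yearly term gives 20 such features per observation. Raising the order lets the seasonal curve follow sharper changes, at the cost of a higher risk of overfitting.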
Figure 4 above shows the initial forecast from Prophet. On its own, however, this plot does not make the seasonality visible. The ‘prophet_plot_components()’ function breaks the forecast down into its individual components.
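The overall workflow of fitting, forecasting and decomposing can be sketched as follows. The article used Prophet's R package; this sketch uses the Python interface instead, where the corresponding calls are `Prophet()`, `fit()`, `make_future_dataframe()`, `predict()` and `plot_components()`. The helper names, the 365-day horizon, and the choice to fill days without a reported event with zero are assumptions for illustration, not the article's actual settings. The third-party imports (`pandas`, `prophet`) are placed inside the fitting function so that the data-preparation helper runs without them.

```python
from datetime import date, timedelta


def to_prophet_rows(daily_fatalities, start, end):
    """Build one row per calendar day in Prophet's `ds` / `y` layout.

    ASSUMPTION: days with no reported event are filled with 0 fatalities.
    """
    rows = []
    day = start
    while day <= end:
        rows.append({"ds": day, "y": daily_fatalities.get(day, 0)})
        day += timedelta(days=1)
    return rows


def fit_and_decompose(rows, horizon_days=365):
    """Fit Prophet and plot trend, weekly and yearly components.

    Requires the third-party packages pandas and prophet (plus matplotlib).
    """
    import pandas as pd
    from prophet import Prophet

    df = pd.DataFrame(rows)
    df["ds"] = pd.to_datetime(df["ds"])

    # Daily seasonality is switched off because the data are one value per day.
    model = Prophet(
        yearly_seasonality=True,
        weekly_seasonality=True,
        daily_seasonality=False,
    )
    model.fit(df)

    future = model.make_future_dataframe(periods=horizon_days)
    forecast = model.predict(future)
    figure = model.plot_components(forecast)
    return model, forecast, figure


# Hypothetical toy input (NOT ACLED data), just to show the row layout.
toy_rows = to_prophet_rows({date(2017, 1, 2): 5}, date(2017, 1, 1), date(2017, 1, 3))
print(toy_rows)
```

In the Python interface, passing an integer instead of `True` for `yearly_seasonality` sets the Fourier order of the yearly term, which is the parameter discussed in this section.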
Figure 5 shows the daily fatality data from Afghanistan broken down into its individual components: trend, weekly seasonality, and yearly seasonality. The ‘yearly’ component shows strong evidence of seasonality, with fatality numbers increasing during the spring and summer and decreasing during the winter. The frequency of fatalities from armed clashes between Afghan security forces and militia groups clearly varies in a seasonal pattern that corresponds closely to the historic fighting season.
Interestingly, there also seems to be evidence of weekly seasonality, with Saturday to Monday having the highest fatality numbers. Additionally, the trend component shows total fatality numbers to be on the rise.