
Anticipating riots all over the world, a case study of collaboration between Citibeats and the IAE-CSIC/BSE

In late 2022, Citibeats began partnering with local universities. As part of these partnerships, Citibeats shared aggregated data generated by its NLP and AI algorithms, giving a team of academic researchers valuable information for their own research. This article describes one major achievement of the collaboration between the IAE-CSIC, the Barcelona School of Economics (BSE), and Citibeats. The team leveraged Citibeats data to predict riots and link them to the underlying social concerns in several subregions of the world. Here are the main highlights of this collaboration.

Origin of the problem


The last decade has seen a dramatic increase in riot events across the world (2020 Global Peace Index). The list of examples is extensive: anti-government protests in Iran, demonstrations in Hong Kong against extradition laws, the Black Lives Matter movement, and protests across Latin America against the economic and political system, to name only a few.

Riots are violent events where demonstrators or mobs engage in disruptive acts, including but not limited to rock throwing and property destruction (source: ACLED).

Given that riots carry significant human, economic and societal costs, it is vital to understand the causes of riots to give decision-makers the opportunity to address the concerns of citizens before these turn into violence.

At the same time, riots are extremely difficult to forecast. They are typically spontaneous events that seem to arise out of nowhere. They say that riots need powder kegs and sparks, so if you want to predict the power of the explosion, you need to assess the size of the powder keg.

In a sense, this is exactly what motivated the collaboration between Citibeats and IAE-CSIC/BSE: tracking possible root causes of riots in order to predict their occurrence.

This is where Citibeats’ Social Risk Monitor came into play. The monitor provides social network data categorized into key topics that capture citizens’ opinions about political and economic events in 90 countries. The collaboration between Citibeats and IAE-CSIC/BSE aims to link riots with the social signals Citibeats detects in several countries, using a model developed by IAE-CSIC. The results were positive: the model managed to predict riot outbreaks. The model developed in this project is a forecasting model and makes no claim to identify causal factors of riots. However, the strong forecasting power of Citibeats’ Social Risk Monitor suggests that riots are fed by grievances relating to fundamental issues like jobs, water, health, or governance, which Citibeats' methodology is able to track. Governments should pay close attention to these grievances if they want to avoid the outbreak of riots.

Data collection and descriptions

A pilot study of riot events was conducted in four Latin American countries: Argentina, Chile, Peru, and Venezuela from October 2020 to October 2022. Data was summarized at the subregional level.

The riot data was obtained from the Armed Conflict Location & Event Data Project (ACLED). ACLED is a disaggregated data collection, analysis, and crisis mapping project that collects information on the dates, actors, locations, fatalities, and types of all reported political violence and protest events around the world. Weekly counts of riot events are considered here. From these counts, the likelihood of a riot outbreak within the next four weeks was calculated for each subregion in the sample.
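
As a rough illustration of how weekly counts can be turned into a four-week-ahead target, here is a minimal Python sketch. It assumes that an "outbreak" means at least one riot in the following four weeks; the post does not spell out the exact definition, and the function name and sample counts are made up for the example.

```python
from typing import List, Optional


def riot_within_horizon(weekly_counts: List[int], horizon: int = 4) -> List[Optional[int]]:
    """Label week t with 1 if any riot occurs in weeks t+1..t+horizon, else 0.

    Assumption: "outbreak" = at least one riot in the look-ahead window.
    Weeks without a full look-ahead window are labeled None, because
    their outcome cannot be observed yet.
    """
    labels: List[Optional[int]] = []
    for t in range(len(weekly_counts)):
        window = weekly_counts[t + 1 : t + 1 + horizon]
        if len(window) < horizon:
            labels.append(None)
        else:
            labels.append(1 if sum(window) > 0 else 0)
    return labels


# Made-up weekly riot counts for a single subregion.
counts = [0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0]
labels = riot_within_horizon(counts)
print(labels)
```

Leaving the last weeks unlabeled keeps the target honest: a week is only scored once its full four-week window has actually been observed.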

Citibeats data comes from different social networks (Facebook public pages, articles, Twitter, Reddit, and more). The data is processed through an analysis pipeline that removes noise and sorts the texts into the 17 categories of the Social Risk Monitor. These categories identify topics of citizens' concerns such as health, culture, environment, or governance. The pipeline also provides information on several demographic segments, such as location, and on whether a social network post carries questions or complaints. The final data is an aggregate macro time series of the weekly interest in each category from October 2020 to October 2022 for more than 200 subregions within the four countries of interest.
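
To make the idea of a weekly interest series more concrete, the sketch below aggregates already-categorized posts into per-subregion, per-week category shares. This is not Citibeats' pipeline: the post schema, the use of ISO weeks, and the choice of "share of posts" as the interest measure are all assumptions made for illustration, and the sample posts are invented.

```python
from collections import Counter, defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple

# Hypothetical schema: (subregion, posting date, category).
Post = Tuple[str, date, str]
WeekKey = Tuple[str, Tuple[int, int]]  # (subregion, (ISO year, ISO week))


def weekly_category_shares(posts: Iterable[Post]) -> Dict[WeekKey, Dict[str, float]]:
    """Share of posts per category, for each subregion and ISO week."""
    counts: Dict[WeekKey, Counter] = defaultdict(Counter)
    for subregion, day, category in posts:
        iso = day.isocalendar()
        counts[(subregion, (iso[0], iso[1]))][category] += 1
    shares: Dict[WeekKey, Dict[str, float]] = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        shares[key] = {cat: n / total for cat, n in counter.items()}
    return shares


# Invented sample data for one subregion over two weeks.
posts = [
    ("Lima", date(2021, 3, 1), "health"),
    ("Lima", date(2021, 3, 2), "health"),
    ("Lima", date(2021, 3, 3), "governance"),
    ("Lima", date(2021, 3, 4), "employment"),
    ("Lima", date(2021, 3, 8), "governance"),
]
shares = weekly_category_shares(posts)
print(shares)
```

Using shares rather than raw counts is one way to keep subregions with very different posting volumes comparable; other normalizations would be equally plausible.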

The data is summarized in the table below:

Table 1: Overview of the data

Main results - Preliminary

*DISCLAIMER* The problem description has been simplified. The objective is to give an intuitive explanation of the different models.

To show how Citibeats data can better anticipate riots, the IAE-CSIC/BSE model was fitted to three different types of inputs:

  1. Historical riots data: based on a subregion’s past riot history, such as the number of weeks since the last riot, or the number of riots in the previous two weeks.
  2. Citibeats data: the weekly evolution of citizens’ interests by subregion, provided by Citibeats’ Social Risk Monitor.
  3. Historical and Citibeats data: a combination of both types of inputs.
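
The history-based inputs in item 1 can be sketched in a few lines of Python. The feature names are hypothetical, and two details are assumptions rather than the researchers' actual definitions: the "previous two weeks" window excludes the current week, and "weeks since the last riot" is undefined (None) until a first riot has been observed.

```python
from typing import Dict, List, Optional


def history_features(weekly_counts: List[int], lookback: int = 2) -> List[Dict[str, Optional[int]]]:
    """Per-week features built only from a subregion's own riot history."""
    features: List[Dict[str, Optional[int]]] = []
    weeks_since_last: Optional[int] = None  # None until a first riot is seen
    for t, count in enumerate(weekly_counts):
        if count > 0:
            weeks_since_last = 0
        elif weeks_since_last is not None:
            weeks_since_last += 1
        # Riots in the `lookback` weeks before week t (week t excluded).
        recent = sum(weekly_counts[max(0, t - lookback) : t])
        features.append(
            {"weeks_since_last_riot": weeks_since_last, "riots_prev_weeks": recent}
        )
    return features


# Made-up weekly riot counts for a single subregion.
feats = history_features([0, 3, 0, 0, 1])
for week, row in enumerate(feats):
    print(week, row)
```

For the combined input in item 3, rows like these would simply be joined with the Citibeats category series for the same subregion and week.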

The idea is to determine how Citibeats data can improve results compared to a model that is based on historical riots data only.

The ROC AUC score was used to compare the performance of the different models. It measures the area under the ROC curve, which plots the true positive rate against the false positive rate across classification thresholds. It thus captures the degree of separability, i.e. how well the model distinguishes between riots and non-riots. An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation. The higher the AUC, the better the model is at identifying riot occurrences.
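
For intuition, the AUC can also be read as the probability that a randomly chosen riot week receives a higher predicted score than a randomly chosen non-riot week. The minimal sketch below computes it that way (the pairwise, Mann-Whitney formulation, with ties counted as half). It is an illustration with made-up scores, not the evaluation code used in the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    Returns the fraction of (riot, non-riot) pairs in which the riot
    example has the higher score; tied pairs count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = riot within the horizon) and model scores.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, y_score)
print(auc)  # 0.75: three of the four (riot, non-riot) pairs are ranked correctly
```

The double loop is quadratic in the number of examples, which is fine for a demonstration; library implementations typically use a sort-based method instead.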

When assessing the likelihood of a riot outbreak within the next four weeks, Citibeats data substantially improved predictive performance. Figure 2 contrasts the ROC curves of the different models: history (blue), Citibeats (orange), and the combined model of history and Citibeats text data (green). The larger the area under the curve, the better the performance of the model. As depicted in the figure, incorporating Citibeats data yields substantial gains in the area under the ROC curve. This indicates that with the Citibeats data the model is better at distinguishing weeks with riot events from weeks without.

Figure 2 also displays the feature importances of the model. Among the top 10 features that help to predict riots are social concerns related to environment, employment, governance, and migration. This is true across dramatically different contexts in different regions, countries, and even continents. This generality of risk indicators is a crucial finding as it points to the possibility of finding root causes in the Social Risk Monitor dimensions. The policy repercussions of this would be significant.

Figure 2: Results

Table 2: Results

Achievements of the experiment & Next steps

This brief post aims to highlight the successful collaboration between IAE-CSIC/BSE and Citibeats. Citibeats enabled the researchers to improve their results with data generated by our AI and NLP algorithms. This work will be extended to more countries to explore the robustness of the results across other continents. In particular, it will tackle the causal link between Citibeats social signals and riots. The main objective would be to explain riots in order to provide guidelines for addressing their causes more responsively.

The other big success worth mentioning is the usefulness of the data generated by Citibeats. In this case study, Citibeats helped researchers with their own investigation, but it also demonstrates the overall relevance of the data Citibeats generates daily across several countries. The generated data captures insights about citizens' needs and opinions that currently cannot be provided by any other dataset.

If you are interested in getting access to our generated datasets, don’t hesitate to get in touch with us!




