Entity Extraction for Accurate Predictive Analytics

April 05, 2022 | Entity Extraction, Risk Management

Organizations of All Kinds Need to Gauge the Probability of Potential Future Events

All organizations need to anticipate future events, and many use predictive analytics to do so. Governments, non-governmental organizations (NGOs), and corporations must constantly monitor current events and high-impact incidents to assess the likelihood that they will lead to a crisis, geopolitical instability, loss of life, supply chain and market disruptions, risks to employee and asset safety, or reputational damage.

Governments and NGOs Must Constantly Monitor a Worldwide Threat Environment

For governments and NGOs, it’s imperative to assess, for example, the likelihood that a country will transition peacefully from one regime to another, without violence that might lead to civil war and, down the road, to a failed state. In ongoing conflicts like the war in Ukraine, it’s critical to understand which country, Russia or Ukraine, is more likely to prevail and whether the conflict is likely to spill over into other areas.

It’s also necessary to track various indicators of a country’s social and political health, including gang violence, the level of corruption, political assassinations, and ethnic and racial tensions. The point of tracking these factors is to predict a crisis before it happens and give a government or NGO time to plan and take measures to mitigate the effects, whether unilaterally or through international institutions. Will the current unrest in a country lead to civil war? If it does, what are the chances of spillover effects on its neighbors?

From the Syrian Civil War to other recent geopolitical disasters, one wonders whether they could have been predicted and mitigated, if not averted altogether. Civil unrest in Syria, which grew out of the Arab Spring of 2011, was clearly building in the early days in the form of a growing volume of protest and social media activity that resulted in a harsh government crackdown. Real-time monitoring of streaming social media and traditional information sources could have captured these indicators of growing instability as they emerged. A predictive model could have recognized them as crisis precursors and detected the trend. A better assessment of the severity of the situation could have led to negotiations and compromises to defuse the escalating tension.

Corporations Need to Understand What Is Likely to Happen in Many Areas

Corporations, particularly large ones, also make use of predictive analytics, and they rely on indicators of political and social stability similar to those outlined above. They need to detect a variety of events and assess the likelihood that conditions will deteriorate and cause disruptions to their supply chains, security risks for their employees and assets, reputational damage to their brand, or increased costs.

For example, a company deciding whether to build factories or open new stores in an emerging market would want to utilize predictive analytics to forecast the socioeconomic and political stability of those emerging markets.

Some of the Most Important Information Is Contained in Unstructured Text Data

Predictive analytics is a valuable tool, but one challenge it faces is that it can handle only structured data. Social media in particular consists of staggering amounts of predominantly unstructured text. To make predictive analytics effective on such content, the unstructured text must first be transformed into structured data. There is an exciting technology that does this: Entity Extraction.

How Entity Extraction Unlocks the Predictive Power of Text

Entity Extraction is about recognizing semantic concepts in text. At a basic level, it identifies named entities such as people, organizations, and places. These are useful for a number of applications, but they have very limited predictive power by themselves. Predictive models require more advanced extraction capabilities, in particular:

  • Event Extraction to capture unfolding situations like boycotts, protests, attacks, riots, deaths, fires, etc. It’s not just about correctly recognizing the mention of an event. It’s also about identifying the event participants, that is, the Who, What, Where, and When of an event. (For more information see our related blog.)
  • Sentiment Analysis to detect likes, dislikes, and emotions that can reveal, for instance, growing discontent or angry disapproval. It’s not just about identifying positive and negative language. Through Entity- and Aspect-based Sentiment Analysis, it’s possible to pinpoint the specific aspects that those positive and negative sentiments are about, such as public opinion about ongoing street protests. (Related blog)
  • Relationship Extraction to connect entities and provide critical information like the location of a military base, the license plate of a vehicle, or the address of a building. (Related blog)
  • Geotagging to disambiguate place names (e.g., which Paris?), to be able to associate other entities like people, groups, and events with those locations, and to provide the rich geospatial structured output needed for geographic visualization and GIS analysis. (Related blog)
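
As a rough illustration of the structured output that Event Extraction aims for, the sketch below fills a small record with the What, Where, and When of an event. It is only a toy: the trigger-word list, the place-name list, the `EventRecord` fields, and the example sentence are all invented for illustration, and a production extractor would rely on trained language models rather than keyword lookup.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical trigger words mapped to event types (illustrative only).
EVENT_TRIGGERS = {
    "protest": "protest",
    "protests": "protest",
    "protested": "protest",
    "riot": "riot",
    "riots": "riot",
    "boycott": "boycott",
    "attack": "attack",
    "attacked": "attack",
}

# Hypothetical list of known place names (illustrative only).
KNOWN_PLACES = {"Paris", "Lyon"}

# Matches ISO-style dates such as 2022-01-15.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


@dataclass
class EventRecord:
    event_type: str           # the What
    location: Optional[str]   # the Where
    date: Optional[str]       # the When, as written in the text
    source_text: str          # the sentence the record came from


def extract_event(text: str) -> Optional[EventRecord]:
    """Return a structured record for the first event trigger found, or None."""
    tokens = re.findall(r"[A-Za-z]+", text)
    event_type = next(
        (EVENT_TRIGGERS[t.lower()] for t in tokens if t.lower() in EVENT_TRIGGERS),
        None,
    )
    if event_type is None:
        return None
    location = next((t for t in tokens if t in KNOWN_PLACES), None)
    date_match = DATE_PATTERN.search(text)
    date = date_match.group(0) if date_match else None
    return EventRecord(event_type, location, date, text)


# Invented example sentence, not a real report.
print(extract_event("Crowds protested in Paris on 2022-01-15."))
```

The sketch leaves out the Who of the event, sentiment, relationships, and place-name disambiguation, each of which would add further fields to the record.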

In addition, to support predictive analytics effectively, an ideal Entity Extraction solution also offers the following features:

  • Robustness given imperfect input such as misspellings, missing or misplaced punctuation, partial sentences, and other forms of non-grammatical and noisy text content, all of which are common in social media.
  • Multilingual support to be able to monitor text content in multiple languages.
  • High throughput and scalability for real-time monitoring, with the ability to deploy on any public or private cloud infrastructure and to scale processing horizontally with the available computing power.
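
To make the robustness requirement concrete, here is a minimal sketch of one way to tolerate noisy social media spellings before looking up event keywords. It assumes a small hypothetical vocabulary (`VOCAB`) and uses fuzzy matching from Python's standard `difflib` module. This is a stand-in chosen for brevity, not a description of how any particular extraction product handles noise.

```python
import difflib
import re

# Hypothetical vocabulary of event keywords (illustrative only).
VOCAB = ["protest", "riot", "boycott", "attack"]


def normalize_token(token: str, cutoff: float = 0.8) -> str:
    """Map a noisy token to the closest vocabulary word, if one is close enough."""
    # Lowercase and drop leading hashtag or mention markers.
    cleaned = token.lower().lstrip("#@")
    # Collapse runs of three or more identical characters ("riooooot" -> "riot").
    cleaned = re.sub(r"(.)\1{2,}", r"\1", cleaned)
    # Fuzzy-match against the vocabulary; fall back to the cleaned token.
    matches = difflib.get_close_matches(cleaned, VOCAB, n=1, cutoff=cutoff)
    return matches[0] if matches else cleaned


print(normalize_token("#Protesst"))
print(normalize_token("riooooot"))
```

The `cutoff` value trades recall for precision: lowering it catches more misspellings but also risks mapping unrelated words onto event keywords.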

All these features are critical for turning unstructured data into the structured data that predictive models need in order to forecast future events. Entity Extraction is an exciting AI technology that makes such forecasting possible for organizations.
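
As a closing illustration of what "structured data that predictive models need" can look like, the sketch below turns a list of extracted events into one row of counts per day, which is the kind of table a forecasting model could take as input. The function names, the `(date, event_type)` input shape, and the sample records are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def daily_counts(records: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Group (date, event_type) pairs into per-day counts of each event type."""
    by_day: Dict[str, Counter] = {}
    for date, event_type in records:
        by_day.setdefault(date, Counter())[event_type] += 1
    return by_day


def feature_rows(by_day: Dict[str, Counter], event_types: List[str]) -> List[list]:
    """Build one row per day, in date order: [date, count per event type...]."""
    return [[day] + [by_day[day][t] for t in event_types] for day in sorted(by_day)]


# Invented sample records, not real data.
records = [
    ("2022-01-01", "protest"),
    ("2022-01-01", "protest"),
    ("2022-01-02", "riot"),
]
for row in feature_rows(daily_counts(records), ["protest", "riot"]):
    print(row)
```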