We have previously discussed Entity Extraction and Relationship Extraction, two technologies that unearth critical information from large volumes of unstructured text data. Here we discuss a more advanced type of extraction: Event Extraction.

While Relationship Extraction identifies semantic relationships between two entities (for example, an affiliation relationship between a person and an organization, or a spousal relationship between two people), Event Extraction analyzes text for “Who did What to Whom and Where and When,” and finds events with one or more participants each. It also extracts the location and time of the event if the document contains them. In this way, Event Extraction provides a richer picture of people, organizations, places, and other entities than Relationship Extraction does.

For example, in the passage below, Event Extraction would recognize that this sentence contains an Arrest event. It would identify “Almazbek Atambayev” as the person arrested, “Kyrgyz police” as the entity carrying out the arrest, “Thursday” as the date, and “his rural home near the capital, Bishkek” as the location of the event.

“Kyrgyz police detained former President Almazbek Atambayev on Thursday at his rural home near the capital, Bishkek.”

How does Event Extraction Work?

Event Extraction detects an event in text, disambiguates and assigns its semantic type from its event ontology, and also finds the event’s participants, location, and date when they are expressed in text.
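As a rough illustration, the output for the Arrest example above could be held in a small record like the one below. This is only a sketch: the `Event` class, its field names, and the role labels `agent` and `arrestee` are assumptions made for this example, not the schema of any particular Event Extraction product.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Event:
    """One extracted event (hypothetical schema for illustration)."""
    event_type: str                      # semantic type from the ontology
    trigger: str                         # word or phrase that signals the event
    participants: Dict[str, str] = field(default_factory=dict)  # role -> mention
    location: Optional[str] = None       # filled only if the text states it
    date: Optional[str] = None           # filled only if the text states it


# The Arrest event from the example sentence, written out by hand.
arrest = Event(
    event_type="Arrest",
    trigger="detained",
    participants={
        "agent": "Kyrgyz police",            # who carried out the arrest
        "arrestee": "Almazbek Atambayev",    # who was arrested
    },
    location="his rural home near the capital, Bishkek",
    date="Thursday",
)
```

Location and date are optional because, as noted above, they are extracted only when the document expresses them.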

An event ontology consists of pre-defined event types organized according to their semantics. The size and granularity of the ontology depend on the applications that Event Extraction is meant to serve. The ontology could be hierarchical, where, for example, various business-related events such as Merger, Acquisition, and Bankruptcy would be grouped under a higher-level Business category.
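A hierarchical ontology of this kind can be sketched as a simple mapping from categories to event types. The `Business` grouping follows the example above; the `Justice` category and the `category_of` helper are assumptions added for illustration only.

```python
from typing import Optional

# Hypothetical two-level ontology: category -> event types it contains.
ONTOLOGY = {
    "Business": ["Merger", "Acquisition", "Bankruptcy"],
    "Justice": ["Arrest"],  # assumed category name, not from a real ontology
}


def category_of(event_type: str) -> Optional[str]:
    """Return the parent category of an event type, or None if unknown."""
    for category, event_types in ONTOLOGY.items():
        if event_type in event_types:
            return category
    return None
```

An application that only cares about business news could then filter events by `category_of(event.event_type) == "Business"` rather than listing every fine-grained type.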

Events are defined semantically, and so the same event expressed in syntactically or lexically different ways would have the same event output. For instance, the following sentences all express the same Acquisition event with the same participants and date.

  • “XYZ Corporation acquired/purchased/bought ABC Corporation on March 12, 2018.”
  • “ABC Corporation was acquired/purchased/bought by XYZ Corporation on March 12, 2018.”
  • “XYZ Corporation’s acquisition/purchase of ABC Corporation occurred on March 12, 2018.”

Event Extraction would produce the same output for each of these sentence variants.
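To make the idea concrete, here is a deliberately tiny, pattern-based sketch that maps the active, passive, and nominal variants to one normalized output. Production systems rely on much more robust linguistic analysis or machine-learned models; the regular expressions, the `extract_acquisition` function, and the role names `acquirer` and `acquired` are all assumptions for this toy example.

```python
import re
from typing import Dict, Optional

VERBS = r"(?:acquired|purchased|bought)"
NOUNS = r"(?:acquisition|purchase)"
NAME = r"([A-Z][\w.]*(?: [A-Z][\w.]*)*)"      # run of capitalized words
DATE = r"([A-Z][a-z]+ \d{1,2}, \d{4})"        # e.g. "March 12, 2018"

# Each pattern is paired with the roles of its three capture groups, in order.
PATTERNS = [
    # Active: "X acquired Y on DATE"
    (re.compile(rf"{NAME} {VERBS} {NAME} on {DATE}"),
     ("acquirer", "acquired", "date")),
    # Passive: "Y was acquired by X on DATE"
    (re.compile(rf"{NAME} was {VERBS} by {NAME} on {DATE}"),
     ("acquired", "acquirer", "date")),
    # Nominal: "X's acquisition of Y occurred on DATE"
    (re.compile(rf"{NAME}[’']s {NOUNS} of {NAME} occurred on {DATE}"),
     ("acquirer", "acquired", "date")),
]


def extract_acquisition(sentence: str) -> Optional[Dict[str, str]]:
    """Return a normalized Acquisition event, or None if no pattern matches."""
    for pattern, roles in PATTERNS:
        match = pattern.search(sentence)
        if match:
            event = dict(zip(roles, match.groups()))
            event["type"] = "Acquisition"
            return event
    return None
```

The key design point is that the surface form decides only which pattern fires; the roles assigned to the participants, and therefore the output, stay the same.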

Why is Event Extraction Challenging?

As mentioned above, human language can express the same event with different lexical and syntactic choices. Thus, Event Extraction needs to be able to handle many lexical and syntactic possibilities.

Event Extraction also needs to perform coreference resolution, that is, determine what pronouns (e.g., “he”) or definite noun phrases (e.g., “the company”) refer to, in order to be useful for end-user applications. The sentence “He bought it” has little information value by itself, but if the “he” and the “it” can be resolved to named entities in the same document, e.g., “John Smith” and “ABC Corp.”, then the Event Extraction output becomes much more useful. However, accurate coreference resolution remains a challenging AI problem.
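The sketch below shows only the final, easy step: substituting resolved antecedents into an event's arguments. It assumes that a separate coreference resolver (the hard part) has already produced the `antecedents` mapping; that mapping, the event layout, and the role names are hypothetical.

```python
from typing import Dict


def resolve_arguments(event: Dict, antecedents: Dict[str, str]) -> Dict:
    """Replace each argument mention with its antecedent when one is known.

    Mentions without a known antecedent are left unchanged.
    """
    resolved = dict(event)
    resolved["args"] = {
        role: antecedents.get(mention, mention)
        for role, mention in event["args"].items()
    }
    return resolved


# Raw event extracted from "He bought it".
raw = {"type": "Acquisition", "args": {"acquirer": "He", "acquired": "it"}}

# Assumed output of an upstream coreference resolver for this document.
antecedents = {"He": "John Smith", "it": "ABC Corp."}

resolved = resolve_arguments(raw, antecedents)
```

After resolution, the event names John Smith and ABC Corp. directly, which is what makes it usable in downstream applications.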

Additionally, information about the same event may be scattered across a document. It’s necessary to apply a technique called “event merging” to arrive at the full, complete event. For instance:

“On Wednesday CVS finally closed its massive merger with health insurance giant Aetna. Analysts say that the $70 billion merger positions the company to create a brand new health care model centered on its CVS Minute Clinics.”

Here the information relevant to the event is spread across two separate sentences, and Event Extraction identifies a single Company Merger event with the following participants and attributes:

  • Company: CVS
  • Company: Aetna
  • Value: $70 billion
  • Date: Wednesday

Event Extraction needs to handle all the syntactic complexity and identify “the $70 billion merger” as a reference to the previously mentioned CVS-Aetna merger.
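A minimal sketch of event merging follows, assuming each sentence has already yielded a partial event. The compatibility rule used here (same type and no conflicting slot values) is a simplification chosen for illustration; it would wrongly merge two different mergers mentioned in the same document, which is why real systems also rely on event coreference. The function names and the event layout are hypothetical.

```python
from typing import Dict, List


def compatible(a: Dict, b: Dict) -> bool:
    """Two partial events can merge if no shared slot has conflicting values."""
    for role in a.keys() & b.keys():
        if role != "companies" and a[role] != b[role]:
            return False
    return True


def merge(a: Dict, b: Dict) -> Dict:
    """Combine two partial events, taking the union of their information."""
    out = dict(a)
    for role, value in b.items():
        if role == "companies":
            out["companies"] = out.get("companies", set()) | value
        elif role not in out:
            out[role] = value
    return out


def merge_events(partials: List[Dict]) -> List[Dict]:
    """Greedily fold each partial event into the first compatible one."""
    merged: List[Dict] = []
    for event in partials:
        for i, existing in enumerate(merged):
            if compatible(existing, event):
                merged[i] = merge(existing, event)
                break
        else:
            merged.append(dict(event))
    return merged


# Partial events from the two sentences of the CVS-Aetna passage.
partials = [
    {"type": "Company Merger", "companies": {"CVS", "Aetna"}, "date": "Wednesday"},
    {"type": "Company Merger", "value": "$70 billion"},
]
full_events = merge_events(partials)
```

With these inputs, `full_events` holds a single Company Merger event carrying both companies, the date, and the value.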

Why is Event Extraction Useful?

Event Extraction has many applications. It can be used for:

  • Link Analysis: Particularly in law enforcement and national security, a primary goal of analysis is to map out the connections among individuals and organizations. Event extraction allows links and connections arising from events (e.g., Who Met Whom and When) described in unstructured text data to be added to the link analysis.
  • Geospatial Analysis: When Event Extraction captures the locations of events as well as the events themselves, the events can be overlaid on a map. For example, documents that mention bombing events can be analyzed geospatially. Event Extraction can provide far richer geospatial information than Entity Extraction alone.
  • Biographical information: Relationship Extraction captures some attributes of a person: birthday, age, physical description, family, employment, etc. Event Extraction will tell you additional information about the person’s activities, such as where the person traveled, whom the person met, and so on.
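As an example of the first use case, extracted events can be turned into links between their participants. The sketch below assumes a simple event layout with a `participants` list; the `build_links` function and that layout are hypothetical and not tied to any link-analysis tool.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Optional, Tuple

Link = Tuple[str, str]


def build_links(events: List[Dict]) -> Dict[Link, List[Tuple[str, Optional[str]]]]:
    """Map each pair of co-participants to the events that connect them."""
    links = defaultdict(list)
    for event in events:
        # Sort names so each pair has a single canonical key.
        for a, b in combinations(sorted(event["participants"]), 2):
            links[(a, b)].append((event["type"], event.get("date")))
    return dict(links)


# Events taken from the examples earlier in this article.
events = [
    {"type": "Arrest", "date": "Thursday",
     "participants": ["Kyrgyz police", "Almazbek Atambayev"]},
    {"type": "Company Merger", "date": "Wednesday",
     "participants": ["CVS", "Aetna"]},
]
links = build_links(events)
```

Each key in `links` is an edge that can be added to a link-analysis graph, labeled with the event type and date that justify the connection.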

The following are some business areas where Event Extraction is regularly applied:

  • Risk Management: A company can monitor and assess the risks it faces from bad actors. By monitoring world media, Event Extraction can notify a firm of current adverse information about a potential supplier, e.g., fines imposed for corrupt activities. Because Event Extraction can monitor in real time, there is no need to wait for such actors to be placed on official lists such as those maintained by OFAC.
  • Geopolitical Monitoring: Organizations can use Event Extraction to monitor political or military crises and generate real-time notifications and alerts about what is happening in the affected region.
  • Intelligence Analysis: Event Extraction can be used to identify critical facts occurring in the vast streams of unstructured text data available. The text data is much too voluminous for manual analysis, and Event Extraction allows the automatic identification of key nuggets of information.

Other areas in which Event Extraction is valuable include law enforcement, the life sciences, and public health crisis monitoring.