Entity Extraction: A Must-Have for Due Diligence in a Global Economy

Entity Extraction, Identity Resolution, Risk Management, Social Media Analysis

Today’s global economy has created an exciting marketplace and opened new frontiers, but those opportunities have brought new challenges. One such challenge is how to perform due diligence on prospective customers, partners, or hires who come from disparate and often geographically distant backgrounds.

This is of particular concern for organizations that are required to be careful about whom they do business with under laws and regulations administered by agencies such as OFAC and FinCEN. That’s especially the case for the financial and insurance industries, which must guard against money laundering (through anti-money laundering, or AML, controls), terrorist financing, and any other illicit use that may make them an accessory to a financial crime and consequently the target of hefty fines, sanctions, and prosecution. In addition, fraud is always a top concern for these and other industries as well as for the public sector.

One main way that organizations protect themselves is by performing due diligence and screening prospective and current customers, employees, partners, and vendors against watch lists, sanction lists, blacklists, and PEP (Politically Exposed Person) and KYC (Know Your Customer) databases. If an entity matches an item on one of those lists, the organization can take appropriate steps, which can range from declining to engage with the entity to putting in place additional oversight and review processes.
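As a rough illustration of list screening, the following Python sketch compares a candidate name against a small watch list using approximate string matching from the standard library. The watch-list entries, the `screen` and `normalize` helpers, and the 0.85 threshold are all hypothetical and chosen only for this example; production screening relies on far richer matching logic and real list data.

```python
from difflib import SequenceMatcher
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in no_accents)
    return " ".join(cleaned.lower().split())


def screen(candidate: str, watch_list: list[str], threshold: float = 0.85):
    """Return (entry, score) pairs whose similarity to the candidate meets the threshold."""
    target = normalize(candidate)
    hits = []
    for entry in watch_list:
        score = SequenceMatcher(None, target, normalize(entry)).ratio()
        if score >= threshold:
            hits.append((entry, round(score, 3)))
    # Strongest matches first, so a reviewer sees the most likely hit at the top.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical watch list, invented for illustration only.
WATCH_LIST = ["José Martínez Example", "Acme Shell Holdings Ltd.", "John Q. Sample"]

# A misspelled, unaccented variant of the first entry.
hits = screen("Jose Martinez Exampel", WATCH_LIST)
for entry, score in hits:
    print(f"possible match: {entry} (similarity {score})")
```

The threshold trades false positives against missed matches: lowering it surfaces more name variants for human review, while raising it reduces noise at the risk of missing a misspelled or transliterated name.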

A second way organizations should protect themselves is by researching prospective customers, partners, hires, and their associates for signs of risk exposure. News stories from around the world are a major source of valuable risk information, and social media can also be a rich source. Consider adverse news about a company, or about its officials being indicted, arrested, or prosecuted for involvement in a crime such as bribery, corruption, or collusion. A financial company contemplating doing business with a compromised entity should certainly be aware of that risk.

But with thousands of news sources and social media constantly producing new unstructured content that may be written in multiple languages, the problem is how to mine such a staggering amount of data in real time.

The Power of AI-based Entity Extraction to Mine Big Data

Let’s break this problem down. Getting access to the raw news and social media content is usually pretty straightforward, but what do you need to be able to analyze such worldwide data sources? In other words, what are the requirements for structuring multilingual, text-based Big Data in real time? To keep it simple, here are the top four:

  1. Advanced extraction capabilities. Entity Extraction provides automatic identification of semantic concepts in text. A standard Entity Extraction product will identify a variety of named entities (e.g., people, organizations, places). Being able to recognize named entities is important and necessary but not sufficient to address the risk management challenge. To be able to find risk information, your Entity Extraction tool must also be able to identify advanced semantic concepts such as links and events. For instance, given a company of interest, link extraction will allow you to discover news involving people associated with it, perhaps high-ranking officials or even their relatives. More importantly, event extraction will reveal adverse information (e.g., arrests, indictments, crimes) about the company in question or its employees and associates.
  2. Scalable, real-time processing. Time is of the essence in due diligence. Timely information is critical to making an informed decision. That means being able to process large amounts of data on short notice and/or in real time.
  3. Multilingual extraction. A global economy entails worldwide sources and multiple languages. Your Entity Extraction tool must be able to automatically identify the language or languages of a document and automatically invoke the appropriate language module for optimal natural language processing.
  4. Cross-document entity resolution. Often a person or company is referred to using name variations such as partial names (e.g., no middle name), acronyms, or even misspelled names. Name normalization in conjunction with Identity Resolution allows for the aggregation of information about the same real-world entity across documents, the automatic generation of profile-like reports, and other advanced features such as the automatic generation of monitoring alerts.
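
To make the fourth requirement more concrete, here is a minimal Python sketch of cross-document aggregation based on name normalization alone. It is a deliberately naive heuristic and not a description of how NetOwl or any other product works: the mentions, the `name_key` and `resolve` helpers, and the rule of keying on first and last name are all assumptions made for illustration. Real identity resolution would also handle misspellings and acronyms, and would weigh additional evidence to avoid merging different people who share a name.

```python
from collections import defaultdict

# Hypothetical mentions as (document id, extracted person name); invented for illustration.
MENTIONS = [
    ("doc1", "John Quincy Sample"),
    ("doc2", "John Sample"),
    ("doc3", "Sample, John Q."),
    ("doc4", "Jane Placeholder"),
]


def name_key(name: str) -> tuple[str, str]:
    """Reduce a non-empty person name to (first, last), ignoring middle names and initials."""
    if "," in name:
        # Reorder "Last, First Middle" into "First Middle Last".
        last, rest = name.split(",", 1)
        name = f"{rest} {last}"
    tokens = [token.strip(".").lower() for token in name.split()]
    tokens = [token for token in tokens if token]
    return (tokens[0], tokens[-1])


def resolve(mentions):
    """Group document ids under the normalized key of the name they mention."""
    profiles = defaultdict(set)
    for doc_id, name in mentions:
        profiles[name_key(name)].add(doc_id)
    return dict(profiles)


profiles = resolve(MENTIONS)
for key, doc_ids in profiles.items():
    print(" ".join(key), "->", sorted(doc_ids))
```

Each resulting group collects the documents that appear to discuss the same entity, which is the starting point for a profile-like report or a monitoring alert.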

In summary, performing due diligence in today’s fast-moving global economy requires real-time mining of text-based Big Data from worldwide sources. NetOwl’s advanced, AI-based Entity Extraction provides the key capabilities required to meet your information demands in support of due diligence and ultimately risk management.