Entity Extraction for Faster, Cost-Effective, and Accurate e-Discovery

Electronic discovery, or e-Discovery, is the discovery process applied to electronically stored information (ESI) in legal proceedings. It involves several stages, including identification, preservation, collection, processing, and review of information. Each stage presents its own challenges, but two forces compete throughout: cost and effectiveness. On the one hand, the ultimate goal is to produce evidence that meets a burden of proof. On the other hand, e-Discovery often involves massive amounts of ESI, typically unstructured data in various file formats and on the order of millions of documents. Manually reviewing such volumes is prohibitively expensive and impractical on both the production side and the consumption side of the discovery process. In both cases, the challenge is to narrow a large collection of ESI down to a much smaller set of relevant documents for analysis.

How does Entity Extraction help e-Discovery?

Entity Extraction can play a critical role in preparing raw ESI for review and in identifying relevant information.

First, on the production side, Entity Extraction can be used to identify privileged or sensitive information, which must be protected during the e-Discovery process. Sensitive information such as social security numbers must be redacted before ESI is made available for review. Entity Extraction can recognize social security numbers, account numbers, and other sensitive information with very high accuracy, and that output can be used to produce cleansed or redacted versions of the affected documents.
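As a rough illustration of the redaction step, the sketch below masks social security numbers with a regular expression. It is a simplified stand-in and not how NetOwl works: the pattern, the `redact_ssns` function, and the placeholder text are all assumptions made for this example, and a production extractor would also use context to handle other formats and avoid false positives.

```python
import re

# Assumption: SSNs appear in the common 123-45-6789 layout.
# Other layouts (no dashes, spaces) are deliberately not handled here.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_ssns(text, placeholder="[REDACTED-SSN]"):
    """Return (redacted_text, number_of_redactions)."""
    # re.subn returns the new string together with the substitution count.
    return SSN_PATTERN.subn(placeholder, text)


sample = "Claimant SSN 123-45-6789 was listed on form 2210-A."
redacted, count = redact_ssns(sample)
print(redacted)
print(count)
```

Returning the count alongside the text makes it easy to log how many redactions were applied to each document, which can help when a redacted production set has to be audited.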

Second, and more important, on the consumption side Entity Extraction can help identify relevant information. ESI starts as raw data. If the raw data amounts to more than a few hundred items, it should be turned into an indexed, fully searchable collection. ESI may come with metadata (e.g., to, from, and date information in email messages). This metadata can play an important part in identifying relevant data and providing evidence for a given case, but it is often insufficient or simply unavailable. Entity Extraction can provide or augment metadata and thus turn unstructured data into searchable information. In its most basic form, Entity Extraction automatically recognizes names of people, organizations, and places, as well as time expressions and numeric expressions such as monetary amounts. The metadata associated with an electronic document can then be extended with key concepts such as the names of the companies and people mentioned in it. It is important to realize that Entity Extraction does not just identify known names. Its true power lies in using linguistic context to identify previously unknown and unseen names, which may play a critical role in the legal case.
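To make the idea of augmenting metadata concrete, here is a minimal sketch. It assumes an extractor has already produced `(doc_id, entity_type, entity_text)` tuples; that tuple layout, the sample records, and the function names are hypothetical and do not reflect NetOwl's actual output format.

```python
from collections import defaultdict

# Hypothetical extractor output: (doc_id, entity_type, entity_text).
extracted = [
    ("doc-001", "PERSON", "Jane Roe"),
    ("doc-001", "ORGANIZATION", "Acme Corp"),
    ("doc-002", "PERSON", "Jane Roe"),
    ("doc-002", "MONEY", "$25,000"),
]

# Metadata that arrived with the ESI; doc-002 has none at all.
metadata = {
    "doc-001": {"from": "a@example.com", "date": "2020-01-15"},
    "doc-002": {},
}


def augment_metadata(metadata, extracted):
    """Copy the metadata and add an 'entities' field grouped by type."""
    augmented = {doc_id: dict(fields) for doc_id, fields in metadata.items()}
    for doc_id, etype, text in extracted:
        doc = augmented.setdefault(doc_id, {})
        doc.setdefault("entities", {}).setdefault(etype, []).append(text)
    return augmented


def build_entity_index(extracted):
    """Map each (type, text) pair to the set of documents mentioning it."""
    index = defaultdict(set)
    for doc_id, etype, text in extracted:
        index[(etype, text)].add(doc_id)
    return index


augmented = augment_metadata(metadata, extracted)
index = build_entity_index(extracted)
print(sorted(index[("PERSON", "Jane Roe")]))
```

With an index like this, a reviewer can pull every document that mentions a given person or company even when the original files carried no usable metadata.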

What Makes NetOwl’s Entity Extraction Unique for e-Discovery?

There are a number of ways in which NetOwl’s Entity Extraction offers a unique and critical advantage for e-Discovery:

  1. NetOwl not only identifies named entities with state-of-the-art accuracy, but also offers, out of the box, an advanced capability to identify a broad range of links (e.g., person-associate, person-affiliation), events (e.g., meetings, payments, travel), and their participants. This link and event extraction capability supports analysis that goes well beyond the simple links afforded by co-occurrence or by the To/From/Cc fields of emails and other internal communications. NetOwl enables deeper analysis of documents, such as network link analysis, to reveal clues critical to a given investigation. Furthermore, for specialized domains NetOwl's Entity Extraction can be easily customized to extract additional concepts of interest (e.g., oil rigs for the oil industry).
  1. NetOwl normalizes names so that they can be more easily resolved and aggregated across documents to support semantic search, faceted search, and advanced analysis (e.g., timelines, charts).
  1. NetOwl is engineered specifically for high-volume processing of multiple data sources, which is critical in investigations and legal proceedings where time is of the essence and delays are costly. Additionally, NetOwl integrates document converters to handle hundreds of native document formats.
  1. NetOwl integrates easily with databases, document and content management systems, portals, and other sources of electronic content. Its REST API supports easy integration into existing workflows. NetOwl can be deployed on premises or in the cloud for horizontal scalability, offering rapid processing of massive amounts of data.
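The value of normalized names and typed links for cross-document analysis can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the hand-written alias table stands in for real name normalization, and the `(doc_id, source, link_type, target)` tuples are a hypothetical representation of extracted links, not NetOwl's output format.

```python
from collections import defaultdict

# Hypothetical alias table standing in for real name normalization.
ALIASES = {
    "j. roe": "Jane Roe",
    "jane roe": "Jane Roe",
    "roe, jane": "Jane Roe",
    "acme corp.": "Acme Corp",
    "acme corporation": "Acme Corp",
}


def normalize(name):
    """Collapse whitespace and case, then look up a canonical form."""
    key = " ".join(name.lower().split())
    return ALIASES.get(key, name.strip())


# Hypothetical typed links: (doc_id, source, link_type, target).
links = [
    ("doc-001", "J. Roe", "person-affiliation", "Acme Corporation"),
    ("doc-002", "Roe, Jane", "person-affiliation", "Acme Corp."),
    ("doc-003", "Jane Roe", "person-associate", "John Doe"),
]


def aggregate_links(links):
    """Merge links that refer to the same normalized entities."""
    edges = defaultdict(set)
    for doc_id, source, link_type, target in links:
        edges[(normalize(source), link_type, normalize(target))].add(doc_id)
    return edges


edges = aggregate_links(links)
for (source, link_type, target), docs in sorted(edges.items()):
    print(source, link_type, target, sorted(docs))
```

Without normalization, the first two links would appear as unrelated edges between differently spelled names. After normalization they collapse into a single affiliation supported by two documents, which is the kind of aggregation that network link analysis and faceted search rely on.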

In summary, NetOwl is the best Entity Extraction choice for e-Discovery applications. To learn more about our Entity Extraction software, contact NetOwl today.