How Entity Extraction Has Revolutionized the Legal Industry

Machine Learning and AI are impacting the legal profession in profound ways, perhaps most visibly in the areas of automated text and language analysis, commonly known as Text Analytics.  In an environment that is filled with documents containing unstructured data – contracts, court rulings, case law, written evidence, etc. – new Text Analytics technology that automates the understanding of large volumes of text opens exciting vistas.  The discovery process, often called e-discovery, is the first area in the legal domain that has seen revolutionary shifts, where Entity Extraction and Categorization technologies are supporting attorneys and paralegals in the task of sorting and categorizing large numbers of case-relevant documents.  A task that used to be entirely manual, time-consuming, and expensive can now be accomplished at a fraction of the cost.

Another legal area that is adopting new Text Analytics technology is that of Enterprise Legal Information Providers.  Here too Entity Extraction is playing a central role.  Entity Extraction identifies key concepts in unstructured data, including named entities as well as the links and associations between them.  It also classifies these concepts by the kind of thing they refer to:  it recognizes that “Mary Bates, Esq.” is a person and an attorney, while “The Honorable John J. Smith” is a person and a judge.  Entity Extraction for the legal domain needs to extract people, including attorneys, clients, judges, witnesses, expert witnesses, defendants, etc., as well as organizations such as courts and law firms and their addresses.  In addition, there are many specifically legal items that need to be extracted, such as citations like Griswold v. Connecticut, 381 U.S. 479, 480 (1965) and case names such as Rogers vs. Smith.
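
To make the idea concrete, here is a minimal rule-based sketch in Python.  It is not how any particular product works (production systems typically rely on trained statistical models), and the pattern, the function names, and the role labels are assumptions made for illustration.  It handles only the one citation shape and the two honorifics mentioned above.

```python
import re

# Illustrative pattern for ONE citation shape only (U.S. Reports), e.g.
# "Griswold v. Connecticut, 381 U.S. 479, 480 (1965)". Real citations vary widely.
# Party names are assumed to be runs of capitalized words, so a capitalized
# word directly before the plaintiff (e.g. "See") would be captured too.
CITATION = re.compile(
    r"(?P<plaintiff>[A-Z][\w.']*(?: [A-Z][\w.']*)*) v\. "
    r"(?P<defendant>[A-Z][\w.']*(?: [A-Z][\w.']*)*), "
    r"(?P<volume>\d+) U\.S\. (?P<page>\d+)"
    r"(?:, (?P<pinpoint>\d+))? \((?P<year>\d{4})\)"
)

# Hypothetical title rules: an honorific or suffix implies a role.
TITLE_ROLES = [
    (re.compile(r"The Honorable (?P<name>.+)"), "judge"),
    (re.compile(r"(?P<name>.+), Esq\."), "attorney"),
]


def find_citations(text):
    """Return one dict of named fields per citation found in the text."""
    return [m.groupdict() for m in CITATION.finditer(text)]


def classify_person(mention):
    """Map a person mention to a name plus a role inferred from its title."""
    for pattern, role in TITLE_ROLES:
        m = pattern.fullmatch(mention)
        if m:
            return {"name": m.group("name"), "type": "person", "role": role}
    return {"name": mention, "type": "person", "role": None}
```

Calling `classify_person("Mary Bates, Esq.")` yields a person record with the role `"attorney"`, and `find_citations` splits the Griswold citation into parties, volume, page, pinpoint page, and year.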

The real power of Entity Extraction, however, is that it goes beyond such basics as names of people and organizations.  For example, it extracts the legal specialties of attorneys and also such facts as which courts they may have argued before.  Entity Extraction also extracts the links and associations among people and organizations.  It knows, for example, that Attorney X works for Law Firm Y.  In addition, it can even capture the Who, What, Where, and When of legal events, as illustrated by the following sample sentence:

“John Robertson was convicted on July 13, 2013 of tax fraud in the Southern District of New York in a trial presided over by Judge Robert Mayes.”

From this sentence Entity Extraction extracts:

  • Event: Felony Conviction
  • Date of Event: July 13, 2013
  • Defendant: John Robertson
  • Crime: Tax Fraud
  • Location of Conviction: Southern District of New York
  • Presiding Judge: Robert Mayes
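
The mapping from the sample sentence to the fields above can be sketched as a single hand-written pattern that fills a typed record.  This is only an illustration under strong assumptions: the `ConvictionEvent` class, its field names, and the pattern are hypothetical, the pattern matches this one sentence shape, and it records the event simply as a conviction because grading the offense as a felony requires knowledge that is not in the sentence itself.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass
class ConvictionEvent:
    """Hypothetical record type mirroring the extracted fields."""
    event: str
    event_date: date
    defendant: str
    crime: str
    location: str
    presiding_judge: str


# Matches only sentences shaped like the sample; real text needs far more coverage.
CONVICTION = re.compile(
    r"(?P<defendant>[A-Z]\w+(?: [A-Z]\w+)+) was convicted on "
    r"(?P<date>[A-Z][a-z]+ \d{1,2}, \d{4}) of (?P<crime>[a-z ]+?) in the "
    r"(?P<location>[A-Z][\w ]+?) in a trial presided over by Judge "
    r"(?P<judge>[A-Z]\w+(?: [A-Z]\w+)+)"
)


def extract_conviction(sentence: str) -> Optional[ConvictionEvent]:
    """Return a structured event, or None if the sentence does not match."""
    m = CONVICTION.search(sentence)
    if m is None:
        return None
    return ConvictionEvent(
        event="Conviction",
        # %B parses English month names under the default C locale.
        event_date=datetime.strptime(m.group("date"), "%B %d, %Y").date(),
        defendant=m.group("defendant"),
        crime=m.group("crime").title(),
        location=m.group("location"),
        presiding_judge=m.group("judge"),
    )


SAMPLE = (
    "John Robertson was convicted on July 13, 2013 of tax fraud in the "
    "Southern District of New York in a trial presided over by Judge Robert Mayes."
)
record = extract_conviction(SAMPLE)
```

Because every record has the same fields and types, it can be written to a database row without further interpretation.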

All similar information from a vast number of legal records is likewise processed by Entity Extraction and output in a structured, predictable format suitable for storage in a database.  This structured data can then be leveraged by Enterprise Legal Information Providers to provide enhanced legal information to their customers.

Once structured, such rich legal information supports a semantic search capability that lets legal professionals conduct better, faster, and more relevant research.  Unlike the keyword search found in conventional search engines, which matches strings of text, it can answer questions about entities and the relationships among them.  Drawing on the sample sentence above, searches such as the following become possible over the large volume of structured, aggregated data:

  • What other cases has John Robertson been a defendant in?
  • What were the crimes in those cases?
  • What other trials has Judge Robert Mayes presided over?

All of the data that answers these questions was extracted by Entity Extraction in a completely automated fashion.
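
Once the extracted records sit in a database, questions like these reduce to ordinary structured queries.  The sketch below uses Python's built-in `sqlite3` module with an in-memory database.  The table layout and function names are assumptions, and every row other than the one taken from the sample sentence is invented purely for illustration.

```python
import sqlite3

ROWS = [
    # From the sample sentence.
    ("John Robertson", "Tax Fraud", "2013-07-13",
     "Southern District of New York", "Robert Mayes"),
    # Hypothetical rows, invented so the queries have something to return.
    ("Jane Example", "Wire Fraud", "2014-02-03",
     "Southern District of New York", "Robert Mayes"),
    ("John Robertson", "Mail Fraud", "2009-05-20",
     "Eastern District of New York", "Ann Sample"),
]


def build_db(rows):
    """Create an in-memory table of conviction records and load the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE conviction ("
        "defendant TEXT, crime TEXT, event_date TEXT, court TEXT, judge TEXT)"
    )
    conn.executemany("INSERT INTO conviction VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def cases_for_defendant(conn, name):
    """What cases has this person been a defendant in, and for which crimes?"""
    cur = conn.execute(
        "SELECT crime, event_date, court FROM conviction "
        "WHERE defendant = ? ORDER BY event_date",
        (name,),
    )
    return cur.fetchall()


def trials_for_judge(conn, judge):
    """What trials has this judge presided over?"""
    cur = conn.execute(
        "SELECT defendant, crime FROM conviction "
        "WHERE judge = ? ORDER BY event_date",
        (judge,),
    )
    return cur.fetchall()


conn = build_db(ROWS)
```

A keyword search for "Robert Mayes" would return every document containing the name; the query above returns only the trials in which he was the presiding judge, because the role was captured at extraction time.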

This new capability enables many uses, including:

  • Attorneys may now have the up-to-date professional backgrounds of opposing attorneys delivered to them digitally, since Entity Extraction can assemble attorneys' professional histories from public records, including their areas of expertise, previous cases, etc.
  • Law firms may also use this information when looking for outside counsel in specific areas of the law that they do not have covered in-house.
  • Since data that has been structured from unstructured data is easy to feed into existing link-analysis tools, it is possible to construct navigable networks of linked citations, each characterized by whether the citation is followed or overruled, or by whatever other semantic relation holds between the document and the citation it contains.
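
The citation network in the last item can be pictured as a directed graph whose edges carry a treatment label.  The following is a minimal sketch rather than a description of any existing link-analysis tool: the `CitationGraph` class, its method names, and the label set are assumptions, and the case names are placeholders.

```python
from collections import defaultdict

# Treatment labels named in the text; "other" stands in for any further relation.
TREATMENTS = {"followed", "overruled", "other"}


class CitationGraph:
    """Directed graph: an edge runs from a citing document to the case it cites."""

    def __init__(self):
        self._outgoing = defaultdict(list)  # citing -> [(cited, treatment)]
        self._incoming = defaultdict(list)  # cited  -> [(citing, treatment)]

    def add(self, citing, cited, treatment):
        """Record that `citing` cites `cited` with the given treatment."""
        if treatment not in TREATMENTS:
            raise ValueError(f"unknown treatment: {treatment!r}")
        self._outgoing[citing].append((cited, treatment))
        self._incoming[cited].append((citing, treatment))

    def cites(self, case):
        """Cases that `case` cites, with how it treats each one."""
        return list(self._outgoing.get(case, []))

    def cited_by(self, case, treatment=None):
        """Documents citing `case`, optionally filtered by treatment."""
        return [
            citing
            for citing, t in self._incoming.get(case, [])
            if treatment is None or t == treatment
        ]

    def is_overruled(self, case):
        """True if any later document is recorded as overruling `case`."""
        return bool(self.cited_by(case, "overruled"))


# Placeholder case names, for illustration only.
graph = CitationGraph()
graph.add("Case B", "Case A", "followed")
graph.add("Case C", "Case A", "overruled")
graph.add("Case C", "Case B", "other")
```

With edges stored in both directions, a researcher can navigate forward from a case to its authorities or backward to everything that later relied on or overruled it.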

In the last few years, the introduction of Entity Extraction has provided Enterprise Legal Information Providers with revolutionary enhancements of their ability to deliver a large quantity of on-point and up-to-date legal information to their customers.  It is fair to say that the more mundane tasks of the legal profession – and increasingly the more sophisticated ones – are being automated at a rapid rate.