Identity Resolution for Identity Management

Tracking Who Has Access

In these days of strict data protection regulations and costly data breaches, organizations need firm control over who has access to their systems and data. Strong governance that uses identity resolution technology is increasingly necessary for a secure organization, and it also helps satisfy auditors’ requirements.

The Employee Population Is Constantly Changing

Change is part of every organization’s life cycle. People take on new responsibilities and new roles, and they need access to different applications and systems to perform their jobs. It is therefore crucial for an organization to ensure that people do not retain access that is no longer appropriate. This is essential for reducing the insider threat.

Determining the True Identity of Someone

An important piece of this puzzle is determining the true identity of everyone who has access within the organization, and what they have access to. For instance, access to sensitive data such as health records needs to be tightly monitored, both to safeguard patients’ records and to comply with regulatory requirements. In some organizations, a person can access internal systems under different personas. For example, in an academic medical center someone may appear as a nurse in one database and as a nursing student in another. The organization needs to know that these records refer to the same person, so that the person’s activities and access to sensitive data can be regulated appropriately.

How Identity Resolution Confirms Identities

Suppose an organization possesses databases of records with information on its employees, students, contractors, and clients. The record fields may contain:

  • First and last names (sometimes a middle name or initial)
  • Social security number
  • Email address
  • Home address

The fields that actually get filled may well differ among these classes of persons. The organization may always have a social security number for employees, but maybe not for students and contractors, so that field cannot be taken by itself as sufficient for establishing identity.

What is needed is software that can match across all the fields, scoring the likelihood of a match for each field individually and then generating a combined likelihood score for all fields taken together.
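The idea of per-field scores feeding a combined score can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product works: the field names, the weights in `WEIGHTS`, the sample records, and the use of the standard library's `difflib.SequenceMatcher` as the similarity measure are all hypothetical choices. Missing fields are skipped and the remaining weights are renormalized, so a record without a social security number can still be scored.

```python
from difflib import SequenceMatcher

# Hypothetical field weights; a real system would tune or learn these.
WEIGHTS = {"name": 0.3, "ssn": 0.4, "email": 0.2, "address": 0.1}


def field_score(a, b):
    """Similarity in [0, 1] for one field, or None if either value is missing."""
    if not a or not b:
        return None
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_scores(rec_a, rec_b, weights=WEIGHTS):
    """Return (per-field scores, combined score) for two records given as dicts."""
    per_field = {f: field_score(rec_a.get(f), rec_b.get(f)) for f in weights}
    # Only fields present in both records contribute to the combined score.
    present = {f: s for f, s in per_field.items() if s is not None}
    if not present:
        return per_field, 0.0
    total_weight = sum(weights[f] for f in present)
    combined = sum(weights[f] * s for f, s in present.items()) / total_weight
    return per_field, combined


# Made-up records: the student record has no social security number.
employee = {"name": "John S. Bennett", "ssn": "000-12-3456",
            "email": "jbennett@example.org", "address": "7735 8th Street"}
student = {"name": "John Bennett", "ssn": None,
           "email": "jbennett@example.org", "address": "7735 8th Street"}

per_field, combined = match_scores(employee, student)
print(per_field)
print(round(combined, 3))
```

A weighted average is the simplest way to combine field scores. Production systems typically use probabilistic or learned models instead, but the shape of the computation is similar.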

Why Identity Resolution Is Challenging

Many fields offer their own challenges:

  • Names can vary: “John Bennett” vs. “J. Bennett” vs. “John S. Bennett.”
  • Spelling variants, nicknames, and simple typos may be common in the data: “Elliott” vs. “Eliott” vs. “Elliot” vs. “Eliot” vs. “Elyot.”
  • Names of different ethnicities have their own idiosyncrasies:
    • Spanish names contain both patronymic and matronymic surnames, but the latter is often dropped: “Juan Ramos Guzman” vs. “Juan Ramos.”
    • Names may have been typed with their proper diacritics and special letters, or in simplified form: “José Castañeda” vs. “Jose Castaneda.”
  • Items such as dates need to be handled appropriately. The ordering of the pieces can vary:
    • “January 1, 2013” vs. “1 January, 2013”
    • “10/01/2017” vs. “01/10/2017” (U.S. vs. European)
    • “July 3, 1938” is obviously a pretty close match to “July 3, 1939,” but “October 3, 1944” is also a close match to “October 3, 1954.” The apparent 10-year gap could be the result of a simple typo involving one digit. The matching needs to be sensitive to these kinds of differences.
  • Addresses offer some complex matching challenges:
    • “7735 8th Street, Columbia, NY” vs. “7735 Eighth St., Columbia, New York.” Here three differences need to be handled: “8th” vs. “Eighth,” “Street” vs. “St.,” and “NY” vs. “New York.”
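Several of the challenges listed above can be reduced by normalizing values before comparing them. The following Python sketch uses only the standard library and illustrates the idea under simple assumptions: the `ADDRESS_TOKENS` table and the date formats are tiny hypothetical examples rather than a complete rule set, and all function names are invented for this illustration.

```python
import unicodedata
from datetime import datetime


def strip_diacritics(text):
    """Drop combining marks so accented and simplified spellings compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical, deliberately tiny lookup table. Real address data needs far
# more rules (for example, "St" can also mean "Saint").
ADDRESS_TOKENS = {"st": "street", "eighth": "8th", "ny": "new york"}


def normalize_address(text):
    """Lowercase, remove punctuation, and expand known abbreviations."""
    cleaned = text.lower().replace(",", " ").replace(".", " ")
    return " ".join(ADDRESS_TOKENS.get(tok, tok) for tok in cleaned.split())


# Assumed formats; month names are parsed using the default (English) locale.
DATE_FORMATS = ("%B %d, %Y", "%d %B, %Y")


def parse_date(text):
    """Try each known written format; return a date, or None if none fits."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def numeric_date_readings(text):
    """Both plausible readings (U.S. and European) of a date like 10/01/2017."""
    readings = set()
    for fmt in ("%m/%d/%Y", "%d/%m/%Y"):
        try:
            readings.add(datetime.strptime(text, fmt).date())
        except ValueError:
            pass
    return readings


def date_digit_distance(d1, d2):
    """Count differing characters between two dates written as YYYY-MM-DD."""
    return sum(c1 != c2 for c1, c2 in zip(d1.isoformat(), d2.isoformat()))


print(strip_diacritics("José Castañeda"))
print(normalize_address("7735 Eighth St., Columbia, New York."))
print(date_digit_distance(parse_date("October 3, 1944"),
                          parse_date("October 3, 1954")))
```

Counting differing digits, rather than subtracting the dates, is what lets a one-keystroke typo register as a near match even when the calendar gap is ten years. For ambiguous numeric dates, keeping both readings lets a later step decide which one is consistent with the other fields.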

How Identity Resolution Works

Identity Resolution, also known as Entity Resolution, is a technology that handles all of these phenomena to find records that represent the same identity. It should provide not only highly accurate matching of a given record against one or more datasets, but also an efficient, intelligent way to find groups of similar records (clusters) within those datasets. Using machine learning techniques, some advanced Identity Resolution products apply a different model to each entity type, each optimized to resolve identities of that type. These models are built by the Identity Resolution algorithms from a very large amount of data containing variants that occur in the real world.
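To make the notion of clusters concrete, here is a minimal Python sketch that groups similar name strings. It is not the machine-learning approach described above: it substitutes a plain string similarity from `difflib` and a union-find structure, and the `threshold` value of 0.75 is an arbitrary assumption chosen for the example. It also compares every pair of records, which does not scale to large datasets; real systems narrow the candidate pairs first.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster(records, threshold=0.75):
    """Group records so that any two linked by a chain of close matches
    end up in the same cluster (union-find over pairwise comparisons)."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if similarity(records[i], records[j]) >= threshold:
            parent[find(i)] = find(j)  # merge the two clusters

    groups = {}
    for i, rec in enumerate(records):
        groups.setdefault(find(i), []).append(rec)
    return list(groups.values())


names = ["Elliott", "Eliott", "Elliot", "Eliot", "Elyot", "Bennett"]
for group in cluster(names):
    print(group)
```

Because clusters are formed transitively, “Elyot” can join the group through its closeness to “Eliot” even if it is not close enough to “Elliott” directly. The same property can over-merge records when the threshold is too low, which is one reason tuned models matter in practice.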

Summary

Organizations need robust identity management to control who has access to their systems and data. Identity Resolution is an essential piece of that puzzle.