How Name Matching Is Crucial for Automated Identity Verification

What Is Identity Verification?

In today’s digital world, there is an increasing need for automated Identity Verification to prevent fraud and account impersonation.

Identity Verification is the process of confirming an individual’s identity by checking the data that an applicant or customer has provided against identity documents such as driver’s licenses, passports, national ID cards, and military IDs.

Identity Verification is critical for many applications, including:

Social media account verification (e.g., Facebook, LinkedIn) to prevent impersonation
Bank account opening
Loan applications to banks and other financial institutions
Customer or employee onboarding by firms
Requests for government services such as Social Security and Medicaid/Medicare

The verification process typically involves the following steps:

The customer or applicant uploads a photo of a driver’s license or other identity document.
The text fields of the identity document are digitized with optical character recognition (OCR), and personal information such as name, address, and date of birth is extracted.
The personal information on the identity document is matched against what the applicant has submitted.
Some companies additionally match the applicant against databases of individuals with a criminal record or some other restriction (e.g., sanctions lists and internal blacklists), for example to meet KYC/AML requirements.
Edge cases are sent to manual review if desired.
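The matching and review steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PersonalData` record, the `compare_fields` and `decide` helpers, and the choice of fields are hypothetical, and the comparison is a plain equality check after light normalization rather than anything a production system would rely on.

```python
from dataclasses import dataclass, fields


@dataclass
class PersonalData:
    """Hypothetical record of the fields compared during verification."""
    name: str
    date_of_birth: str  # ISO format, e.g. "1990-04-12"
    address: str


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.lower().split())


def compare_fields(submitted: PersonalData, extracted: PersonalData) -> dict:
    """Return, per field, whether the submitted value equals the OCR-extracted one."""
    return {
        f.name: normalize(getattr(submitted, f.name)) == normalize(getattr(extracted, f.name))
        for f in fields(PersonalData)
    }


def decide(results: dict) -> str:
    """All fields agree: 'verified'. Any disagreement: send to manual review."""
    return "verified" if all(results.values()) else "manual_review"
```

With exact comparison like this, a submitted "Joan Smythe" checked against a document reading "Joan Smith" would always land in manual review, even though the two may well be the same person.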

A Critical Step in Identity Verification Is Matching Personal Data

A major challenge in Identity Verification is that names and other personal data (date of birth, address, etc.) can legitimately differ between what an applicant submitted and what appears on the identity document. Person names, for instance, can vary in multiple ways:

Simple misspellings and spelling variations, including names that sound alike: Joan Smith vs. Joan Smythe; Bill Stuart vs. Bill Stewart
Initials: John Edward MacNally vs. J. E. MacNally
Missing name elements: John Frederick McArdle vs. John McArdle; Tareq al-Nasir vs. Tareq Nasir (“al,” the Arabic definite article, is frequently used in names but is also frequently left out)
Nicknames: William McAllister vs. Bill McAllister vs. Billy McAllister
Name order variations: Kim Ji-su vs. Ji-su Kim (Many East Asian names, including Korean ones, traditionally place the surname first, but they sometimes appear in the Western order.)
Transliteration variants: Muhammad al-Khalidi vs. Mohamed el-Khalidi (Because Arabic is written in a non-Latin script, names have to be transliterated into Latin characters, and there is no single transliteration standard. Consequently, the English spelling of the same Arabic name often differs.)

What makes name matching particularly difficult is that several types of variations can occur in the same name. For instance: William Patrick McAulliffe vs. Bill MacAuliffe.
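To make the combined variations concrete, here is a minimal hand-written sketch that handles nicknames, initials, name order, missing elements, and small spelling differences at once. It is an assumption-laden toy, not the method of any particular product: the `NICKNAMES` table is a tiny illustrative stand-in, the function names are hypothetical, and spelling similarity comes from the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher

# Tiny illustrative nickname table (an assumption; real tables are far larger).
NICKNAMES = {"bill": "william", "billy": "william"}


def tokens(name: str) -> list:
    """Lowercase, drop periods, split on whitespace, and expand known nicknames."""
    parts = name.lower().replace(".", " ").split()
    return [NICKNAMES.get(p, p) for p in parts]


def token_score(a: str, b: str) -> float:
    """Score two name tokens; an initial matches any token that starts with it."""
    if len(a) == 1 or len(b) == 1:
        return 1.0 if a[0] == b[0] else 0.0
    return SequenceMatcher(None, a, b).ratio()


def name_score(left: str, right: str) -> float:
    """Greedily pair each token of the shorter name with its best match in the longer.

    Pairing tokens regardless of position tolerates name order variations, and
    averaging over the shorter name tolerates missing name elements.
    """
    short, long_ = sorted((tokens(left), tokens(right)), key=len)
    if not short:
        return 0.0
    remaining = list(long_)
    total = 0.0
    for tok in short:
        best = max(remaining, key=lambda cand: token_score(tok, cand))
        total += token_score(tok, best)
        remaining.remove(best)
    return total / len(short)
```

Calling `name_score("William Patrick McAulliffe", "Bill MacAuliffe")` yields a high score because the nickname is expanded, the missing middle name is ignored, and the surname spellings are close. Every new kind of variation needs another hand-written rule, though, which is the limitation of this style of matching.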

For other examples of types of name variation, see here.

Matching Names Across Different Writing Systems Is Challenging

In a global economy, names in identity documents may come in a non-Latin script that has to be matched against the Latin version, or vice versa.

Imagine a bank in the Middle East that must verify a customer’s identity documents before processing a transaction. The documents are in Arabic script, but the transaction application may be filled out in Latin characters:

أحمد أمين vs. Ahmed Amin

This situation occurs with other languages:

Wang Yi vs. 王毅 (Chinese)
Alexander Ovechkin vs. Александр Овечкин (Russian)
Park Bo-gum vs. 박보검 (Korean)
Kento Yamazaki vs. 山崎賢人 (Japanese)

In all these cases, the names refer to the same person but are in different writing systems.
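A first practical step is simply recognizing that two names are written in different scripts. The sketch below is one rough way to do that with Python's standard `unicodedata` module. It assumes that the first word of a character's Unicode name (such as `ARABIC` or `CYRILLIC`) is a good enough proxy for its script, and the function names are hypothetical. Note that this proxy cannot tell Chinese from Japanese kanji, since both use Han characters whose Unicode names begin with `CJK`.

```python
import unicodedata
from collections import Counter


def dominant_script(name: str) -> str:
    """Guess a name's script from the first word of each letter's Unicode name."""
    counts = Counter(
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in name
        if ch.isalpha()  # skip spaces, hyphens, and other punctuation
    )
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"


def is_cross_script(left: str, right: str) -> bool:
    """True when the two names appear to be written in different scripts."""
    return dominant_script(left) != dominant_script(right)
```

Detecting the script pair only tells a system which kind of comparison it faces; it does not by itself say whether the two names refer to the same person.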

For more information on the challenges of cross-language name matching, see here.

A Machine-Learning-Based, AI Approach to Fuzzy Name Matching

The challenges posed by Identity Verification can be addressed by a technology called Advanced Fuzzy Name Matching. It uses machine learning to automatically learn a very large collection of probabilistic name matching rules from large-scale, real-world name variation data.

Since the rules are learned from actual data rather than written by hand, they are not constrained by humans’ limited knowledge of possible name matches. They reflect countless name variants that occur in the real world, which allows this approach to match more accurately than hand-written rules.

Advanced Fuzzy Name Matching can also process names at scale in real time, enabling it to reach the matching speed required for document verification.

In the case of cross-language name matching, the matching is performed directly, with no need to transliterate non-Latin scripts into Latin first. This avoids mistakes introduced by the transliteration step, so matching directly between names in different scripts achieves higher accuracy.

Advanced Fuzzy Name Matching also provides a similarity score that can be used to set thresholds for three outcomes: match, no match, and an in-between weaker match. If desired, weaker matches can be evaluated by humans to determine whether they are true matches.
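The thresholding idea can be sketched as follows. The cutoffs of 0.85 and 0.60, the 0 to 1 score range, and the `classify` function are assumptions chosen for illustration. In practice, thresholds would be tuned to an organization's own tolerance for false matches and missed matches.

```python
def classify(score: float,
             match_threshold: float = 0.85,
             review_threshold: float = 0.60) -> str:
    """Map a similarity score in [0, 1] to 'match', 'review', or 'no_match'.

    The default thresholds are illustrative assumptions, not recommended values.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if review_threshold > match_threshold:
        raise ValueError("review_threshold must not exceed match_threshold")
    if score >= match_threshold:
        return "match"
    if score >= review_threshold:
        return "review"  # weaker match: route to a human reviewer if desired
    return "no_match"
```

Raising `match_threshold` sends more borderline cases to human review, while lowering `review_threshold` reduces the number of names rejected outright.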

Summary

Advanced Fuzzy Name Matching is a technology that provides fast and accurate matching. It allows organizations to confirm that the individual requesting a service is actually who they say they are.
