How Name Matching Is Critical to Prevent Financial Fraud


Large financial institutions need to guard against, among other things, fraudulently submitted loan applications where there is no intention to repay from the outset. One way to do this (in addition to performing due diligence against official lists such as OFAC) is to try to match a new client being on-boarded against the financial institution’s previously confirmed fraud cases. Another valuable strategy is to match each newly identified fraud case against the bank’s entire customer base to identify repeat offenders. At some very large institutions, the number of existing and past clients can reach into the tens of millions. In addition to handling large numbers of customers, the matching needs to be done over multiple elements in the client data. The applicant’s name is relevant, but so are other elements, such as the address, date of birth, and phone numbers, which must also match before one can be confident of the match. After all, many individuals share the same or a similar name.

The number of data fields in a client application can be quite large: as many as 10 to 20 fields may need to be matched. Completing all of this in a timely fashion as part of the client on-boarding process is simply beyond what human reviewers alone can do. Nor is it obvious how a human analyst could consistently decide what constitutes a good match across so many data fields (name, address, date of birth, phone number, and so on).

Why Name Matching is Hard

The underlying challenge is that matching such data is hard. Personal names pose the following challenges to successful matching:

  • Misspellings
  • Word order variations (e.g., John Gibbons vs. Gibbons, John). John being a common first name is an obvious tip-off as to the right answer, but how about Thomas James vs. James Thomas? Which is the first name?
  • Initials (e.g., John Smith vs. J. Smith; John F. Smith vs. John Smith)
  • Nicknames (e.g., Edward/Edmond/Edwin vs. Ed)
  • Some names sound alike but are spelled differently (e.g., Sean vs. Shawn vs. Shaun).
  • In application forms that break out the name into multiple fields (First Name, Middle Initial, Last Name), what do you do when the applicant leaves out the middle initial or, even worse, enters the first name and middle initial in the first name field?
  • In a globalizing world, the matching has to succeed across many cultures with different naming conventions (e.g., Mahmud bin Muhammad al-Ahmad vs. Mahmud al-Ahmad). In addition, since many of these names were originally written in a script whose sounds differ from those of the Latin alphabet, there can be a very large number of alternative spellings of the same name in English (e.g., Bashir al-Assad vs. Bachar al-Assad).
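
Several of the variations above (word order, initials, nicknames, and misspellings) can be illustrated with a minimal Python sketch that uses only the standard library. The nickname table, the function names, and the 0.9 score given to initials and nicknames are assumptions made for illustration; they do not describe any particular product, and the sketch does not attempt transliteration or sound-alike matching.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, tiny nickname table; a real system would need far more entries.
NICKNAMES = {"ed": {"edward", "edmond", "edwin"}, "bill": {"william"}}


def tokens(name):
    """Lowercase a name, drop punctuation, and split it into word tokens."""
    return re.findall(r"[a-z]+", name.lower())


def token_similarity(a, b):
    """Score two name tokens between 0.0 and 1.0."""
    if a == b:
        return 1.0
    # An initial matches any token that starts with the same letter.
    if (len(a) == 1 or len(b) == 1) and a[0] == b[0]:
        return 0.9
    # Nickname lookup in either direction.
    if b in NICKNAMES.get(a, ()) or a in NICKNAMES.get(b, ()):
        return 0.9
    # Fall back to character-level similarity to tolerate misspellings.
    return SequenceMatcher(None, a, b).ratio()


def name_similarity(name_a, name_b):
    """Order-insensitive score: greedily pair each token of the shorter
    name with its best remaining match in the longer name. Extra tokens in
    the longer name (such as a middle initial) are ignored."""
    ta, tb = tokens(name_a), tokens(name_b)
    if not ta or not tb:
        return 0.0
    short, long_ = sorted((ta, tb), key=len)
    remaining = list(long_)
    total = 0.0
    for tok in short:
        best = max(remaining, key=lambda other: token_similarity(tok, other))
        total += token_similarity(tok, best)
        remaining.remove(best)
    return total / len(short)
```

With these assumptions, name_similarity("John Gibbons", "Gibbons, John") scores a full match regardless of word order, while "J. Smith" vs. "John Smith" scores slightly lower because the initial is only a partial confirmation.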

Unlike names, items like date of birth may sound easy to match, but here too there are difficulties. Beyond the difference between the European (Day/Month/Year) and American (Month/Day/Year) conventions, there is also the question of how to judge how close a match is: 1946 may seem a long way from 1956, but in fact it differs in only one digit and may be a simple typing error. Should this match necessarily be seen as less close than 1945 vs. 1946?
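
The two notions of closeness discussed above can be put side by side in a short Python sketch. The function names are hypothetical; the point is only that a numeric gap and a count of differing digits can rank the same pair of years very differently.

```python
def digit_differences(a, b):
    """Count the positions at which two equal-length digit strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def year_closeness(a, b):
    """Return (numeric gap in years, number of differing digits)."""
    return abs(int(a) - int(b)), digit_differences(a, b)


print(year_closeness("1946", "1956"))  # ten years apart, one digit differs
print(year_closeness("1945", "1946"))  # one year apart, one digit differs
```

Measured by differing digits, both pairs are equally close, which is why a matcher that allows for typing errors may want to consider both measures.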

Names of companies offer their own special challenges for matching. Here are a few:

  • Abbreviations are common: International Business Machines vs. IBM
  • The name may be truncated: Alison Aircraft vs. Alison
  • The presence or absence of corporate designators needs to be handled: Jones Tires, Inc. vs. Jones Tires
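
The three company-name cases above can be sketched in Python as follows. The designator list is deliberately incomplete, and the function names and returned labels are assumptions chosen for this illustration only.

```python
import re

# Illustrative, incomplete list of corporate designators.
DESIGNATORS = {"inc", "incorporated", "corp", "corporation",
               "co", "company", "ltd", "limited", "llc"}


def core_tokens(company):
    """Lowercase, drop punctuation, and remove corporate designators."""
    words = re.findall(r"[a-z0-9]+", company.lower())
    return [w for w in words if w not in DESIGNATORS]


def acronym(words):
    """Build an acronym from the first letter of each word."""
    return "".join(w[0] for w in words)


def company_match_type(a, b):
    """Return 'exact', 'acronym', 'truncation', or None."""
    ta, tb = core_tokens(a), core_tokens(b)
    if not ta or not tb:
        return None
    if ta == tb:
        return "exact"  # identical once designators are removed
    short, long_ = sorted((ta, tb), key=len)
    if len(short) == 1 and len(long_) > 1 and short[0] == acronym(long_):
        return "acronym"
    if long_[:len(short)] == short:
        return "truncation"  # the shorter name is a prefix of the longer
    return None
```

Under these assumptions, "International Business Machines" vs. "IBM" is reported as an acronym match, "Alison Aircraft" vs. "Alison" as a truncation, and "Jones Tires, Inc." vs. "Jones Tires" as exact after the designator is dropped.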

Dates exhibit their own peculiarities:

  • The ordering of the component parts of dates can vary: September 10, 2014 vs. 10 September, 2014
  • Day names may be included: Friday, Sept. 7, 1945 vs. Sept. 7, 1945
  • Years and names of days and months may be abbreviated: 01/01/2019 vs. 01/01/19, Monday vs. Mon, January vs. Jan.
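
One way to cope with these date variations is to reduce every input to a single canonical form before comparing. The Python sketch below is a simplified illustration: the function name, the month_first flag (for the Month/Day vs. Day/Month ambiguity in all-numeric dates), and the pivot used to expand two-digit years are all assumptions, and only the patterns listed above are handled.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}

# Full or abbreviated English day names, e.g. "Monday" or "Mon".
DAY_NAME = re.compile(r"\b(mon|tue|wed|thu|fri|sat|sun)[a-z]*\b")


def parse_date(text, month_first=True, pivot=50):
    """Parse a few common date layouts into a datetime.date.

    month_first: read all-numeric dates as Month/Day/Year (assumption).
    pivot: two-digit years below it become 20xx, others 19xx (assumption).
    """
    cleaned = DAY_NAME.sub(" ", text.lower())
    parts = re.findall(r"[a-z]+|\d+", cleaned)
    words = [p for p in parts if p.isalpha()]
    nums = [p for p in parts if p.isdigit()]
    if len(words) == 1 and len(nums) == 2 and words[0][:3] in MONTHS:
        # A named month: the day precedes the year in either word order.
        month = MONTHS[words[0][:3]]
        day, year = nums
    elif not words and len(nums) == 3:
        first, second, year = nums
        month, day = (first, second) if month_first else (second, first)
    else:
        raise ValueError(f"unrecognized date: {text!r}")
    year_number = int(year)
    if len(year) <= 2:
        year_number += 2000 if year_number < pivot else 1900
    return date(year_number, int(month), int(day))
```

With this sketch, "September 10, 2014" and "10 September, 2014" parse to the same date, as do "01/01/2019" and "01/01/19" under the assumed pivot.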

Addresses have problems too:

  • 6901 7th Street, Scarsdale, NY vs. 6901 Seventh St., Scarsdale, New York (three differences)
  • Postal codes may be present or absent.
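
The address differences above can be reduced with simple normalization before comparison. In the Python sketch below, the lookup tables are small illustrative samples, the postal code pattern assumes US ZIP codes, and the function names are hypothetical.

```python
import re

# Illustrative, incomplete lookup tables.
ORDINALS = {"first": "1st", "second": "2nd", "third": "3rd", "fourth": "4th",
            "fifth": "5th", "sixth": "6th", "seventh": "7th",
            "eighth": "8th", "ninth": "9th", "tenth": "10th"}
STREET_TYPES = {"street": "st", "avenue": "ave", "road": "rd",
                "boulevard": "blvd"}
STATES = {"new york": "ny", "new jersey": "nj", "california": "ca"}

# A trailing US ZIP code, with an optional four-digit extension.
ZIP = re.compile(r"\b(\d{5})(?:-\d{4})?\s*$")


def normalize_address(address):
    """Return (normalized address text, ZIP code or None)."""
    text = address.lower().strip()
    zip_match = ZIP.search(text)
    zip_code = zip_match.group(1) if zip_match else None
    if zip_match:
        text = text[:zip_match.start()]
    text = re.sub(r"[^\w\s]", " ", text)  # drop punctuation
    text = " ".join(text.split())         # collapse whitespace
    # Abbreviate a state name only when it ends the address, so that a
    # city called "New York" earlier in the string is left alone.
    for full, abbreviation in STATES.items():
        text = re.sub(rf"\b{full}$", abbreviation, text)
    words = [ORDINALS.get(w, STREET_TYPES.get(w, w)) for w in text.split()]
    return " ".join(words), zip_code


def addresses_match(a, b):
    """Match when normalized text agrees and ZIP codes do not conflict."""
    text_a, zip_a = normalize_address(a)
    text_b, zip_b = normalize_address(b)
    zips_compatible = zip_a is None or zip_b is None or zip_a == zip_b
    return text_a == text_b and zips_compatible
```

Both example addresses above normalize to the same string here, and a ZIP code present on only one side does not prevent the match.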

The above is only a sample of the variations that effective Name Matching has to handle.

How Name Matching Technology Can Protect Against Financial Fraud

To be an effective tool against financial fraud, any Name Matching product should exhibit the following features:

  • Handles all the types of name matching challenges described above
  • Provides highly accurate matching, thereby minimizing both false positives (bad matches that are reported) and false negatives (good matches that are missed)
  • Applies Machine Learning and Artificial Intelligence to real world name variant data to achieve high accuracy
  • Returns a list of matches ranked in terms of similarity scores, providing the users with the capability of experimenting with the score thresholds
  • Handles Big Data, meaning that it can match large numbers of names in the form of queries against a very large database of names. This allows real-time matching for critical business applications
  • Allows a user to tune the matching behavior. For instance, a user may assign different weights on different fields – a higher weight on names and a medium weight on addresses, and so on, according to each organization’s business rules.
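
The ranked scores, adjustable threshold, and per-field weights described above can be illustrated with a minimal Python sketch. The weights, the field names, and the use of difflib's generic string similarity are assumptions made for this example; a production matcher would use field-specific comparison and an index rather than scanning every candidate.

```python
from difflib import SequenceMatcher

# Hypothetical weights: names count most, then addresses, then birth dates.
WEIGHTS = {"name": 0.5, "address": 0.3, "date_of_birth": 0.2}


def field_similarity(a, b):
    """Generic case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def record_score(query, candidate, weights=WEIGHTS):
    """Weighted average over the fields present in both records."""
    total = 0.0
    weight_sum = 0.0
    for field, weight in weights.items():
        if query.get(field) and candidate.get(field):
            total += weight * field_similarity(query[field], candidate[field])
            weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


def ranked_matches(query, candidates, threshold=0.8, weights=WEIGHTS):
    """Return (score, candidate) pairs at or above the threshold,
    ordered from the highest score to the lowest."""
    scored = [(record_score(query, c, weights), c) for c in candidates]
    hits = [(score, c) for score, c in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[0], reverse=True)
```

Raising the threshold trades false positives for false negatives, and changing the weights shifts how much a disagreement in any one field can pull a candidate below the cutoff.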

In sum, state-of-the-art Name Matching provides a critical weapon in the fight against fraud, helping to uncover incidents of attempted fraud in an accurate and timely fashion.