Name Matching for Safer Borders


Almost anyone who has passed through an airport has waited in long lines while agents check the passport against the boarding pass, compare the passport photo with the traveler’s face, and examine the passport data closely, especially the expiration date. The process can be tedious, but countries need to keep the wrong people, terrorists and traffickers in particular, out at every port of entry: airports, seaports, and land crossings. It is equally important that countries allow a free flow of the people they do want coming in: business people, tourists, students, and others. Balancing these two requirements is a top priority, but it is not an easy task.

There are technologies that can help, such as biometric fingerprint identification. Another, perhaps less widely appreciated, is Name Matching. Its importance is hard to overstate: Tamerlan Tsarnaev, the primary conspirator in the Boston Marathon bombing, reportedly avoided the attention of authorities because his name was misspelled in a database.

Name Matching may seem straightforward when the name is “John P. Doe.” A missing middle initial, as in “John Doe,” or a full middle name, as in “John Peter Doe,” produces an inexact match, but if other information such as date of birth agrees, the match can still be treated as sufficiently precise.
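The middle-name case above can be sketched in a few lines of Python. This is only an illustration under simple assumptions (Western-style names with the given name first and the family name last); the function names `tokens`, `middle_compatible`, and `same_person` are invented for this example and do not come from any particular product.

```python
from datetime import date

def tokens(name):
    # Lowercase, drop periods, and split on whitespace.
    return name.lower().replace(".", "").split()

def middle_compatible(mid_a, mid_b):
    # A missing middle name is compatible with anything.
    if not mid_a or not mid_b:
        return True
    # An initial is compatible with a full name that starts with it.
    if len(mid_a) == 1 or len(mid_b) == 1:
        return mid_a[0] == mid_b[0]
    return mid_a == mid_b

def same_person(name_a, dob_a, name_b, dob_b):
    ta, tb = tokens(name_a), tokens(name_b)
    if len(ta) < 2 or len(tb) < 2:
        return False
    # Assumption: first token is the given name, last token the family name.
    if ta[0] != tb[0] or ta[-1] != tb[-1]:
        return False
    mid_a = " ".join(ta[1:-1])
    mid_b = " ".join(tb[1:-1])
    # The inexact name match is accepted only if the date of birth agrees.
    return middle_compatible(mid_a, mid_b) and dob_a == dob_b

dob = date(1980, 1, 2)
print(same_person("John P. Doe", dob, "John Doe", dob))
print(same_person("John P. Doe", dob, "John Peter Doe", dob))
```

Rules like these are easy to write for one naming convention, which is exactly why they break down for the cases discussed next.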

But there is a greater source of difficulty: what if a passenger’s name comes from a different cultural tradition? What if the name is Abdul Aziz on the boarding pass and Abdul Aziz bin Ahmad on the passport? How confident can the border agent be that both names refer to the same person?

Name Matching Challenges

Matching names from different cultures poses many challenges, including:

  • Transliteration Variants: Arabic names are natively written in Arabic script, as with the common male given name محمود. A matching problem arises when this name is transliterated into English, because محمود can be rendered in more than one way: Mahmoud, Mahmud, Mehmoud, Mehmud, and undoubtedly others. The same holds for any language whose native script is non-Latin.
  • Name Order Variants: Names in many East Asian cultures (Chinese, Japanese, and Korean, for example) come in the order Family Name/Given Name, the opposite of the Western custom.
  • Nicknames: Mikhail/Misha, Alexandra/Sasha
  • Name Division Variants: Deng Xiaoping vs. Deng Xiao Ping
  • Misspellings, whether accidental or intentional

These are just a few of the phenomena involved. Moreover, a single name can show any combination of these variants, which compounds the problem. For example, Park Sol Mi is the same name as Solmi Bak (Sol Mi = Solmi, Park = Bak).
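To make the combination of variants concrete, here is a minimal Python sketch that handles name order, name division, and a few spelling variants at once. The `VARIANTS` table is a tiny hand-written assumption that covers only the examples in this article; a real system would need far broader coverage, and the function names are invented for illustration.

```python
from itertools import permutations

# Hypothetical variant table: maps a spelling to one canonical form.
VARIANTS = {
    "bak": "park", "pak": "park",
    "mahmud": "mahmoud", "mehmoud": "mahmoud", "mehmud": "mahmoud",
    "misha": "mikhail", "sasha": "alexandra",
}

def spellings(name):
    # Canonicalize each token, then join the tokens in every possible order.
    # Joining without spaces ignores name division (Sol Mi vs. Solmi);
    # trying every order ignores name order (family name first or last).
    toks = [VARIANTS.get(t, t) for t in name.lower().split()]
    return {"".join(p) for p in permutations(toks)}

def may_match(name_a, name_b):
    # Two names may match if any ordering of one equals any ordering of the other.
    return bool(spellings(name_a) & spellings(name_b))

print(may_match("Park Sol Mi", "Solmi Bak"))
print(may_match("Deng Xiaoping", "Deng Xiao Ping"))
```

Trying every token order is only practical for short names, and a lookup table cannot anticipate every misspelling, which is the gap that learned, probabilistic matching is meant to fill.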

How Name Matching Helps

Advanced Name Matching addresses all of the above challenges. Based on Machine Learning and Artificial Intelligence, it automatically learns a set of probabilistic name matching rules from a collection of real-world matches that are known to be correct, and it can generalize from these known matches to previously unseen ones. As a result, it can accurately match names that a human would not recognize as good matches.
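As a rough illustration of learning rules from known matches, the toy Python sketch below extracts the spelling differences seen in a handful of known-good pairs, turns their counts into relative frequencies, and scores a new pair by the differences it contains. It is a deliberately simplified stand-in, not the method any actual product uses; the training pairs and the function names `differences`, `learn`, and `score` are assumptions made for this example.

```python
from collections import Counter
from difflib import SequenceMatcher

def differences(a, b):
    # Return the (from, to) substrings where a and b differ.
    ops = SequenceMatcher(None, a, b, autojunk=False).get_opcodes()
    return [(a[i1:i2], b[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]

def learn(known_matches):
    # Count each difference seen in known-good pairs; direction is ignored.
    counts = Counter()
    for a, b in known_matches:
        for x, y in differences(a.lower(), b.lower()):
            counts[frozenset((x, y))] += 1
    total = sum(counts.values())
    return {rule: n / total for rule, n in counts.items()}

def score(a, b, rules):
    # Multiply the learned frequencies of every difference between a and b.
    # A difference never seen in training drives the score to zero.
    result = 1.0
    for x, y in differences(a.lower(), b.lower()):
        result *= rules.get(frozenset((x, y)), 0.0)
    return result

rules = learn([("Mahmoud", "Mahmud"), ("Mehmoud", "Mahmoud")])
# "Mehmud" vs. "Mahmoud" was never seen as a pair, but its differences were.
print(score("Mehmud", "Mahmoud", rules) > 0)
print(score("Mahmoud", "John", rules))
```

The point of the sketch is generalization: the pair Mehmud/Mahmoud never appears in the training data, yet it scores above zero because each of its differences was observed in some known match.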

In addition, advanced Name Matching handles cross-language name matching well: it learns to match directly between the Latin representation of a name and a non-Latin one. It does not require, as some other approaches do, that foreign scripts first be transliterated into Latin, a step that can introduce errors and degrade the matching.

Finally, advanced Name Matching scales well. It can handle the very high throughput required by the task of matching a name against a database of millions of names, as some customers require, and do it in real time.
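The article does not say how this throughput is achieved. One common general technique for this kind of problem is blocking: index the database by a coarse key so that each query is compared in detail against only a small set of candidates instead of every record. The Python sketch below assumes a simple consonant-skeleton key; the `skeleton` function and `NameIndex` class are hypothetical names invented for this illustration.

```python
from collections import defaultdict

VOWELS = set("aeiou")

def skeleton(token):
    # Coarse key: keep the first letter, drop vowels from the rest.
    token = token.lower()
    return token[:1] + "".join(c for c in token[1:] if c not in VOWELS)

class NameIndex:
    def __init__(self):
        self._buckets = defaultdict(set)

    def add(self, name):
        # File the full name under the key of each of its tokens.
        for tok in name.split():
            self._buckets[skeleton(tok)].add(name)

    def candidates(self, query):
        # Collect every stored name that shares a token key with the query.
        found = set()
        for tok in query.split():
            found |= self._buckets.get(skeleton(tok), set())
        return found

index = NameIndex()
for name in ["Mahmoud Aziz", "John Doe", "Deng Xiaoping"]:
    index.add(name)

# Only the candidates returned here would go on to detailed scoring.
print(index.candidates("Mehmud"))
```

A coarse key trades recall for speed: it must be loose enough that true matches land in the same bucket, with the more expensive scoring left to sort out the false candidates.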