Identity Resolution and the Customer Stitching Puzzle

Consumer-focused companies such as large retailers gather large amounts of consumer data to better understand and serve their customers. It all starts with consumers interacting with businesses, often through a variety of channels, apps, stores, and transactions. These interactions contain valuable information about consumers and their preferences. The problem is that each interaction may result in a separate consumer record. For customer data to be truly useful, we must first figure out which records represent the same consumer. Only when information from multiple records is consolidated into unique consumer profiles is the data truly valuable. Consolidated consumer records translate into better customer care, more relevant, engaging, and effective customer-centric marketing (so-called people-based marketing), and more accurate data for market research.

The challenge is to detect, both accurately and efficiently, which records among millions or hundreds of millions represent the same real-world person. Records for the same customer may share an identifier such as a name or mailing address, but those identifiers may not be unique enough (e.g., the classic “John Smith” problem), may be partial (e.g., only a first name or last name), or may fail to match because of nicknames, missing name components (e.g., a missing middle name), typos, word order variations, or different language scripts (e.g., international travelers).
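To make a few of these mismatches concrete, here is a minimal Python sketch that treats two names as compatible when they agree after nickname expansion, regardless of word order or a missing middle name. The nickname table and the function names are hypothetical illustrations, not part of any product, and the sketch deliberately ignores typos and foreign scripts.

```python
# Hypothetical nickname table; a real system would use a far larger resource.
NICKNAMES = {"bill": "william", "bob": "robert", "liz": "elizabeth"}


def name_key(name: str) -> frozenset:
    """Lowercase, strip simple punctuation, expand nicknames, ignore word order."""
    tokens = name.lower().replace(",", " ").replace(".", " ").split()
    return frozenset(NICKNAMES.get(token, token) for token in tokens)


def names_compatible(a: str, b: str) -> bool:
    """True if one name's tokens are a subset of the other's.

    This tolerates a missing middle name or a partial name, but not typos.
    """
    key_a, key_b = name_key(a), name_key(b)
    return bool(key_a) and bool(key_b) and (key_a <= key_b or key_b <= key_a)


print(names_compatible("Smith, Bill", "William J. Smith"))
print(names_compatible("John Smith", "Jane Smith"))
```

Note that a partial name such as "Smith" is compatible with every "Smith" on file under this rule, which is exactly why a name on its own is rarely enough to merge two records.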

Customer Stitching

Building unified customer records often requires stitching together data from various sources. A customer may have inquired about a product on one channel and purchased a related product on another, such as a store. By stitching both interactions together, your business gains a more complete understanding of its customers and the types of products they care about. This is key information for effective customer outreach. More broadly, customer stitching allows your business to identify its customer personas and build more engaging marketing campaigns.

Identity Resolution, a Key Technology for Customer Stitching

Identity Resolution is about figuring out which records represent the same entity, in this case the same consumer. For instance, a business with a master repository or index of known customer identities can search any ‘new’ customer against its index. If a match with a strong similarity score is found, the new record can be automatically merged with the master record. If no match is found, a new record can be added to the index. If desired, a human can be brought into the loop for cases with fairly strong but inconclusive matches.
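The merge, review, or add decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the threshold values are hypothetical, and the `similarity` function uses the standard library's `difflib.SequenceMatcher` as a crude stand-in for a real name matcher, not the matching a production system would use.

```python
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical thresholds; real values would be tuned per application.
AUTO_MERGE = 0.90
REVIEW = 0.75


def similarity(a: str, b: str) -> float:
    """Crude string similarity in [0, 1]; a stand-in for a real matcher."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def triage(score: float) -> str:
    """Map a similarity score to an action."""
    if score >= AUTO_MERGE:
        return "merge"   # strong match: merge with the master record
    if score >= REVIEW:
        return "review"  # fairly strong but inconclusive: ask a human
    return "add"         # no match: treat as a new identity


def resolve(new_name: str, index: list) -> tuple:
    """Search the index for the best match and return (action, best_match)."""
    best: Optional[str] = None
    best_score = 0.0
    for known in index:
        score = similarity(new_name, known)
        if score > best_score:
            best, best_score = known, score
    action = triage(best_score)
    if action == "add":
        index.append(new_name)  # grow the index with the new identity
        return action, None
    return action, best


index = ["Maria Gonzalez", "John Smith"]
print(resolve("maria gonzalez", index))
print(resolve("Wei Zhang", index))
```

A linear scan like this is only workable for tiny indexes; it is meant to show the decision logic, not how to search hundreds of millions of records.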

In order to deduplicate and stitch consumer records together, an Identity Resolution system must have the following properties:

  1. High accuracy. Merging records requires intelligent matching of various fields, including not just names but also other key attributes such as date of birth, place of birth, and address. The confidence of a match is determined by combining evidence from multiple attributes. These attributes typically involve various entity types, such as people, organizations, places, and addresses. For example, accurately matching information about a specific individual may rely on knowledge about their phone number (numeric), home address (address), date of birth (date), place of birth (place), spouse (person), child (person), employer (organization), or education (organization). Records from different sources may have different subsets of these potential fields.
  2. Scalable and real-time. Consumer records are a true Big Data problem. Identity Resolution must support scalable, real-time searching of massive databases with hundreds of millions of records. NetOwl can match new records against large quantities of existing records in real time.
  3. Tunable. Application-specific business rules determine what combination of record attributes should be matched and how important each attribute is to the overall matching score.
  4. Foreign script matching. NetOwl’s Identity Resolution leverages its award-winning machine learning-based, multicultural, multi-lingual name matching product to enable sophisticated name matching of various entity types across different languages and language scripts (e.g., Latin, Chinese, Cyrillic, Arabic).
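As a rough illustration of the accuracy and tunability properties above, the following Python sketch combines per-attribute similarities into one weighted score and skips attributes that are missing from either record. The weights, field names, and the use of `difflib.SequenceMatcher` for every field are assumptions made for the example; they are not NetOwl's scoring method, and a real system would use type-specific matchers for dates, addresses, and names.

```python
from difflib import SequenceMatcher

# Hypothetical business rule: how much each attribute contributes to the score.
WEIGHTS = {"name": 0.5, "date_of_birth": 0.3, "address": 0.2}


def field_similarity(a: str, b: str) -> float:
    """Generic string similarity in [0, 1], used here for every field type."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(rec_a: dict, rec_b: dict, weights: dict = WEIGHTS) -> float:
    """Weighted average of field similarities over fields present in both records."""
    total = 0.0
    weight_sum = 0.0
    for field_name, weight in weights.items():
        a, b = rec_a.get(field_name), rec_b.get(field_name)
        if not a or not b:
            continue  # sources may carry different subsets of fields
        total += weight * field_similarity(a, b)
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


crm = {"name": "John Smith", "date_of_birth": "1980-01-01", "address": "12 Oak St"}
web = {"name": "John Smith", "date_of_birth": "1975-06-30"}
print(round(match_score(crm, web), 2))
```

Changing `WEIGHTS` is the tuning knob in this sketch: raising the weight on date of birth, for example, makes two "John Smith" records with different birth dates score lower.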

NetOwl’s AI-based Identity Resolution software provides a high-accuracy, scalable, fast, and tunable solution to match and help merge customer records from multiple sources. It allows companies to dedupe their customer data to produce a high-quality customer database, thus enabling more reliable insights and effective marketing, customer care, and market research.