Identity Resolution Facilitates the Creation of Electronic Health Records

Electronic Health Records Promote Better Health Outcomes

Universal Electronic Health Records (EHRs), in which every citizen of a country has a health record linked to a national system, promote better health outcomes. They also promote higher levels of trust in the medical system. Medical data would also yield greater value for technologies like Artificial Intelligence if it were as comprehensive as it would be with a national EHR.

In the US, the introduction of EHRs has progressed rapidly since the passage of the Affordable Care Act in 2010. Although few people in the US think that a national EHR is a near- or even medium-term prospect, it was hoped that EHRs would contribute to more coordinated care among the health care providers in a limited geographical area. This would reduce duplicated medical tests, avoid errors that arise when one provider lacks information that another has, and allow critical information to be exchanged smoothly.

Even though EHRs have proven to be an improvement in the US, doctors and hospitals still have difficulty sharing data. Patients go to different providers, and it is still not easy to link records across different health IT systems. Converting one system’s data to be compatible with another’s is a hard problem.

Identity Resolution Can Improve Medical Data Sharing

Fortunately, there is a technology that will support the linking of data records and so facilitate the development of accurate and complete EHRs: Identity Resolution (also known as Entity Resolution).

Identity Resolution identifies the variations that occur in data elements across different records. A patient record typically includes fields such as:

First and last names (occasionally a middle name or initial)
Date of birth
Home address
Home and mobile phone numbers
Email address
Insurance ID

Linking Patient Records Can Be Tricky

Each field of a patient record offers its own challenges:

Names vary: “Jim Baker” vs. “J. Baker” vs. “James R. Baker.” Phenomena like nicknames, initials, and simple typos may be common in the data.
Dates have their own characteristics, e.g., the order of the components may vary:
- “October 9, 2017” vs. “9 October, 2017”
- “10/01/2017” vs. “01/10/2017” (U.S. vs. European)
- What counts as a close match may also vary, e.g., “August 4, 1938” is a close match to “August 3, 1938,” but “January 3, 1961” is also a close match to “January 3, 1971.” The apparent 10-year gap in the latter could be caused by a single mistyped digit. The matching has to take such cases into account.
Addresses are quite complex, e.g.,
- “7735 8th Street, Columbia, NY 01923” vs. “7735 Eighth St. Columbia, New York 01923-3494”

There are four differences here that need to be handled: “8th” vs. “Eighth,” “Street” vs. “St.,” “NY” vs. “New York,” and the “short” (5-digit) form of the zip code vs. the “long” (ZIP+4) form.
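One common way to handle such differences is to normalize both addresses to a canonical form before comparing them. The following is a minimal sketch under stated assumptions, not the method of any particular product: the function name `normalize_address` and the three small lookup tables are hypothetical and cover only a handful of cases.

```python
import re

# Hypothetical lookup tables for illustration; a real system needs far larger ones.
ORDINALS = {"first": "1st", "second": "2nd", "third": "3rd", "fourth": "4th",
            "fifth": "5th", "sixth": "6th", "seventh": "7th", "eighth": "8th",
            "ninth": "9th", "tenth": "10th"}
# Note: "st" can also mean "saint"; this sketch ignores that ambiguity.
STREET_TYPES = {"st": "street", "ave": "avenue", "dr": "drive",
                "ln": "lane", "ct": "court"}
STATES = {"new york": "ny", "virginia": "va", "maine": "me",
          "minnesota": "mn", "maryland": "md"}


def normalize_address(raw: str) -> str:
    """Return a lowercase canonical form of a U.S. street address."""
    text = raw.lower()
    text = re.sub(r"[.,]", " ", text)                  # drop periods and commas
    text = re.sub(r"\b(\d{5})-\d{4}\b", r"\1", text)   # ZIP+4 -> 5-digit ZIP
    for name, abbreviation in STATES.items():          # state name -> abbreviation
        text = re.sub(rf"\b{name}\b", abbreviation, text)
    tokens = []
    for token in text.split():
        token = ORDINALS.get(token, token)             # "eighth" -> "8th"
        token = STREET_TYPES.get(token, token)         # "st" -> "street"
        tokens.append(token)
    return " ".join(tokens)


a = normalize_address("7735 8th Street, Columbia, NY 01923")
b = normalize_address("7735 Eighth St. Columbia, New York 01923-3494")
print(a)       # 7735 8th street columbia ny 01923
print(a == b)  # True
```

Exact comparison after normalization only works when the tables cover every variant, so in practice normalization is usually combined with a fuzzy comparison that tolerates typos.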

To establish that two records refer to the same individual, each of the above fields must first be matched and given a score for how close the two values are.

In addition, there must be a way to combine the similarity scores of the individual fields into a single score according to business rules, and to use that score to determine whether two records belong to the same individual.
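The two steps just described, scoring each field and then combining the scores, can be sketched as follows. This is an illustration only: the nickname table, the weights, the threshold, and all function names are assumptions made for this example, and the standard-library `difflib.SequenceMatcher` stands in for a real fuzzy matcher.

```python
from datetime import date
from difflib import SequenceMatcher
from typing import Dict

# Hypothetical nickname table; real systems use much larger ones.
NICKNAMES = {"jim": "james", "maggie": "margaret", "pepe": "jose"}

# Hypothetical business rules: per-field weights and a match threshold.
WEIGHTS = {"name": 0.4, "dob": 0.4, "address": 0.2}
MATCH_THRESHOLD = 0.85


def name_score(a: str, b: str) -> float:
    """Similarity in [0, 1] after expanding nicknames and ignoring token order."""
    def canon(name: str) -> str:
        tokens = [t.strip(".").lower() for t in name.replace(",", " ").split()]
        return " ".join(sorted(NICKNAMES.get(t, t) for t in tokens if t))
    return SequenceMatcher(None, canon(a), canon(b)).ratio()


def date_score(a: date, b: date) -> float:
    """Fraction of matching digits in YYYYMMDD form, so one mistyped digit scores high."""
    da, db = a.isoformat().replace("-", ""), b.isoformat().replace("-", "")
    return sum(x == y for x, y in zip(da, db)) / len(da)


def address_score(a: str, b: str) -> float:
    """Plain string similarity; a real matcher would normalize the address first."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def combined_score(scores: Dict[str, float]) -> float:
    """Weighted sum of the per-field scores."""
    return sum(weight * scores[field] for field, weight in WEIGHTS.items())


def same_person(scores: Dict[str, float]) -> bool:
    """Apply the (hypothetical) threshold to the combined score."""
    return combined_score(scores) >= MATCH_THRESHOLD


scores = {
    "name": name_score("James Baker", "Baker, Jim"),
    "dob": date_score(date(1971, 10, 9), date(1971, 10, 9)),
    "address": address_score("45 Maple St., Brentwood, VA",
                             "45 Maple Street, Brentwood, Virginia 22093"),
}
print(same_person(scores))                                # True
print(date_score(date(1961, 1, 3), date(1971, 1, 3)))     # 0.875
```

Comparing dates digit by digit is one simple way to treat “January 3, 1961” and “January 3, 1971” as a close match even though they are ten years apart. Raising or lowering the threshold trades precision against recall.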

Here are some examples of patient records that show typical variations in the data:

Name	DoB	Address
James Baker	10/09/71	45 Maple St., Brentwood, VA
Baker, Jim	Oct. 9, 1971	45 Maple Street, Brentwood, Virginia 22093

Name	DoB	Address
Margaret L. Jones	11/3/1990	6 Park Lane, Hialeya, ME
Maggie Jones	November 3, 1990	6 Park Ln, Hialeya, Maine 01923

Name	DoB	Address
Rashid Abdurrahman	3 March, 1995	4 Emory Court, Louisville, MN
Rachid ‘Abd al-Rahman	Mar 3, 1995	Four Emory Ct., Louisville, Minnesota

Name	DoB	Address
Jose A. Benitez Artola	3/4/1979	2134 Raspberry Dr., Olney, MD
Pepe Benítez	3 April 1979	2134 Rasberry Drive, Olney, Maryland

How Identity Resolution Offers Highly Accurate Record Linking

In Identity Resolution, each pair of patient records is first compared using highly accurate, AI-based Fuzzy Name Matching, which handles a wide spectrum of variations in person names, addresses, dates, phone numbers, etc. All records are then clustered (or linked) according to the similarities calculated by Fuzzy Name Matching, using a very efficient clustering algorithm that can handle a massive number of records. Each resulting cluster represents a real individual and is assigned a persistent ID.

As new records become available and are added to the system, Identity Resolution determines whether they belong to existing clusters or represent new patients, in which case new clusters with new IDs are created. The clustering algorithm also assigns a score to each cluster that indicates how closely the records in that cluster match each other, which allows users to make tradeoffs between recall and precision based on their particular use cases.
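The article does not specify the clustering algorithm, so the following is a deliberately naive sketch of the workflow only: a new record joins the best-matching existing cluster or starts a new one with a fresh persistent ID, and each cluster can report a score. The class `IdentityClusters`, the scorer `toy_score`, and the threshold are all hypothetical, and this greedy approach compares a new record against every stored record, which would not scale to massive data sets.

```python
import itertools
from difflib import SequenceMatcher
from typing import Callable, Dict, List


def toy_score(a: dict, b: dict) -> float:
    """Hypothetical pairwise scorer: half name similarity, half exact date-of-birth match."""
    name = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    dob = 1.0 if a["dob"] == b["dob"] else 0.0
    return 0.5 * name + 0.5 * dob


class IdentityClusters:
    """Greedy incremental clustering keyed by persistent integer IDs."""

    def __init__(self, score: Callable[[dict, dict], float], threshold: float):
        self.score = score
        self.threshold = threshold
        self.clusters: Dict[int, List[dict]] = {}
        self._next_id = itertools.count(1)

    def add(self, record: dict) -> int:
        """Add a record and return the ID of the cluster it was assigned to."""
        best_id, best_score = None, self.threshold
        for cluster_id, members in self.clusters.items():
            # Single-link rule: the closest member decides how well the record fits.
            s = max(self.score(record, m) for m in members)
            if s >= best_score:
                best_id, best_score = cluster_id, s
        if best_id is None:                       # no cluster is close enough
            best_id = next(self._next_id)
            self.clusters[best_id] = []
        self.clusters[best_id].append(record)
        return best_id

    def cohesion(self, cluster_id: int) -> float:
        """Lowest pairwise score inside a cluster (1.0 for a single record)."""
        pairs = itertools.combinations(self.clusters[cluster_id], 2)
        return min((self.score(a, b) for a, b in pairs), default=1.0)


resolver = IdentityClusters(toy_score, threshold=0.7)
ids = [
    resolver.add({"name": "Margaret L. Jones", "dob": "1990-11-03"}),
    resolver.add({"name": "Maggie Jones", "dob": "1990-11-03"}),
    resolver.add({"name": "James Baker", "dob": "1971-10-09"}),
]
print(ids)  # [1, 1, 2]
```

A higher threshold yields smaller, more reliable clusters (higher precision), while a lower one links more variant records at the risk of merging different people (higher recall).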

In sum, it may be a while before the US gets to universal EHRs, but it can derive great improvements from the use of Identity Resolution to enhance the sharing of medical information.
