- A computer distribution company in the Americas, running Salesforce CRM with 5.2 million records, successfully consolidated the CRM of its Asian acquisition, which had 1.0 million contacts in its legacy system, within a month.
Chained Duplicates: Connecting Records for Single Version of Truth
Contactous' Enterprise Data Hygiene (EDH) finds common patterns across disjointed data clusters and stitches them together to give a 360-degree view of a record. This is best explained by an example. Assume there are 5 datasets, averaging a million records each, with some overlapping fields between them. EDH can chain the duplicate records out of these millions and create a unified view within seconds.
In this example, EDH has grouped 5 matching records into a single cluster, out of 5 million records, within seconds. It does this by:
- Matching on name, and then on name + date, between the 1st and 4th records. It double-checks the match on state and country too. It now has a mobile number (from the 4th record).
- Using the mobile number, EDH then chains the 2nd record and gets an email address and an ID number.
- It then connects the 2nd and 3rd records through the email address.
- EDH goes back to the output of Step #1 and uses the Twitter handle to connect to the 5th record.
- The result is on the right: a unified data record from 5 disconnected datasets. The record is also cleaned and standardized, as shown in the date and address fields.
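The chaining steps above can be sketched in a few lines of Python. This is only an illustration of the general idea (transitive linking with a union-find structure), not EDH's actual algorithm, which is not described here. The sample records, the field names, and the `MATCH_KEYS` rules are all hypothetical and chosen to mirror the five-record example.

```python
from collections import defaultdict

# Hypothetical sample records, one per dataset; every value is invented.
RECORDS = [
    {"rec": 1, "name": "Jane Doe", "dob": "1980-02-14", "state": "CA",
     "country": "US", "twitter": "@janedoe"},
    {"rec": 2, "mobile": "555-0100", "email": "jane@example.com",
     "id_number": "X123"},
    {"rec": 3, "email": "jane@example.com", "address": "1 Main St"},
    {"rec": 4, "name": "Jane Doe", "dob": "1980-02-14", "state": "CA",
     "country": "US", "mobile": "555-0100"},
    {"rec": 5, "twitter": "@janedoe", "company": "Acme"},
    {"rec": 6, "name": "John Roe", "dob": "1975-07-01",
     "email": "john@example.com"},
]

# Assumed match rules: two records are linked when every field in a
# rule is present in both records and the values are equal.
MATCH_KEYS = [("name", "dob", "state", "country"),
              ("mobile",), ("email",), ("twitter",)]


def find(parent, x):
    """Return the cluster root of x, shortening the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def chain_duplicates(records, match_keys):
    """Group records that are connected through any chain of matches."""
    parent = {r["rec"]: r["rec"] for r in records}
    for key in match_keys:
        first_seen = {}  # key value -> first record carrying it
        for r in records:
            if not all(f in r for f in key):
                continue
            value = tuple(r[f] for f in key)
            if value in first_seen:
                # Merge this record's cluster into the earlier one.
                parent[find(parent, r["rec"])] = find(parent, first_seen[value])
            else:
                first_seen[value] = r["rec"]
    clusters = defaultdict(list)
    for r in records:
        clusters[find(parent, r["rec"])].append(r)
    return list(clusters.values())


def unify(cluster):
    """Merge a cluster into one record; the first value seen for a field wins."""
    merged = {}
    for r in cluster:
        for field, value in r.items():
            if field != "rec":
                merged.setdefault(field, value)
    return merged


clusters = chain_duplicates(RECORDS, MATCH_KEYS)
largest = max(clusters, key=len)
unified = unify(largest)
```

Here records 1 and 4 link on name, date, state and country; record 2 joins through the mobile number, record 3 through the email address, and record 5 through the Twitter handle, while the unrelated record 6 stays in its own cluster. A production system would also need fuzzy matching, value standardization, and conflict resolution between fields, none of which this sketch attempts.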
An actual use case of Chained Duplicates
The same EDH algorithms that link duplicates into a unified view are used at a hospital to chain patient records. In the illustration below, EDH has found 4 records in 4 different datasets and consolidated them into a single record.