DataStitch: Secure, On-Premise DeDuplication
DataStitch, a Contactous product, is intended for organizations whose large datasets, running to millions of records, need to be cleaned through quality deduplication without uploading the data to any external system or cloud.
On-Premise Execution
Full Name - First Example Example of Name de-duplication taken from a medical institution in India. Combinations of salutations and qualifications, and swaps of first names and surnames, were considered:
Full Name - Second Example Example of Name de-duplication taken from a warranty registration database in the Philippines:
Full Name - Third Example A powerful example of the algorithm's capabilities. Example of Name de-duplication taken from a database of a South Asian country:
Full Name - Fourth Example Example of variations of a name considered in a suspected duplicate cluster:
Address - First Example This is one of the best examples of Address de-duplication, highlighted by the algorithm within a massive CRM database in India. Not only are there inconsistent abbreviations and spelling errors, but the old and new official names of the city have also been detected as duplicates:
Address - Second Example An example of Address de-duplication, from Singapore:
Address - Third Example An example of a duplicate address cluster from Australia. Note the abbreviations and variations of the state name captured in the duplicate cluster:
Mobile Numbers The algorithm finds mobile numbers in multiple formats and groups the duplicates together. Here's an example of such a group:
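As a rough illustration of how numbers in different formats might be brought together, the Python sketch below strips all non-digit characters and keeps the trailing national number. This is an assumption made for illustration only, not DataStitch's actual algorithm; the function names and the 10-digit national length are hypothetical. The sample values are the mobile numbers listed in the Name + Mobile Number example in this document.

```python
import re
from collections import defaultdict


def normalize_mobile(raw: str, national_length: int = 10) -> str:
    """Drop every non-digit, then keep the last `national_length` digits.

    Assumption: country codes and trunk prefixes (+91, 0091, 0) only ever
    appear in front of a fixed-length national number.
    """
    digits = re.sub(r"\D", "", raw)
    return digits[-national_length:]


def group_mobiles(numbers):
    """Group raw number strings by their normalized form."""
    groups = defaultdict(list)
    for raw in numbers:
        groups[normalize_mobile(raw)].append(raw)
    return dict(groups)


samples = [
    "+91-98336-90611",
    "0091 98 33 69 06 11",
    "(9833) 690-611",
    "0-98336-90611",
    "9 8 3 3 6 9 0 6 1 1",
]

clusters = group_mobiles(samples)
for key, members in clusters.items():
    print(key, members)
```

All five variants reduce to the same key, so they land in one group. A production matcher would also need to handle numbers of differing national lengths, which this sketch ignores.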
Company Name - First Example An example of Company Name de-duplication from the Philippines:
Company Name - Second Example Another similar example of Company Name de-duplication, from India:
Name + Mobile Number Example of 5 duplicate Name and Mobile Number combinations as found by the system:
Name: Mohammad Kasim Mobile: +91-98336-90611
Name: Mohd. Kasim Mobile: 0091 98 33 69 06 11
Name: Mhd. Kasim Mobile: (9833) 690-611
Name: Md. Kasim Mobile: 0-98336-90611
Name: Muhammad Kasim Mobile: 9 8 3 3 6 9 0 6 1 1
Name + Date of Birth Example of 4 duplicate combinations of Person's Name and Date of Birth found:
Person's Name: Narendra Bajpayee Date of Birth: 15/11/1984
Person's Name: Narindir Bajpayee Date of Birth: 11-15-1984
Person's Name: Narender Bejpeyee Date of Birth: 15.11.84
Person's Name: Nariinder Baajpayii Date of Birth: 1984, novembr 15
Name + Company Example of 4 duplicate combinations of Person's and Company Names as found by DataStitch:
Person's Name: Sanjiv Kumar Company's Name: HPE India Private Limited
Person's Name: Sanjeve Kumarr Company's Name: HPE India Private Ltd.
Person's Name: Sanjeev Qumar Company's Name: HPE Pvt. Ltd
Person's Name: Sanjive Koomar Company's Name: HPE Limited
Website URL DataStitch groups different website URLs that refer to the same page into a single cluster. Here's an example of such a group:
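To make the idea of URL clustering concrete, here is a minimal Python sketch that reduces URL variants to a common key. It is an illustrative assumption, not DataStitch's actual method: the function name `canonical_url`, the normalization rules (ignore scheme, `www.` prefix, trailing slash, `index.html`, query string and fragment) and the `example.com` URLs are all hypothetical.

```python
from urllib.parse import urlsplit


def canonical_url(url: str) -> str:
    """Return a simplified key under which URL variants can be grouped."""
    # Bare hosts such as "example.com" need a scheme to parse correctly.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)

    # Host names are case-insensitive; treat "www." as optional (assumption).
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    # Ignore trailing slashes and default index pages (assumption).
    path = parts.path.rstrip("/")
    for suffix in ("/index.html", "/index.htm"):
        if path.lower().endswith(suffix):
            path = path[: -len(suffix)]

    # Scheme, port, query string and fragment are deliberately dropped.
    return host + path


variants = [
    "https://www.example.com/",
    "http://example.com/index.html",
    "EXAMPLE.com",
]
print({canonical_url(u) for u in variants})
```

Dropping the query string is the most aggressive choice here; on sites where the query selects the page, it would have to be kept and its parameters sorted instead.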