DataStitch is a Do-It-Yourself data preparation application. It enables non-technical users to consolidate disconnected data in CSV files and create clean, usable datasets without external help.
You have successfully collected data. Millions of records in multiple formats reside in disconnected files, coming from multiple systems, regions, sources and timelines. DataStitch makes them usable by cleaning and connecting them.
On-Premise Execution
Full Name - First Example
An example of Name de-duplication taken from a medical institution in India. Combinations of salutations, qualifications and swapped first names and surnames were considered:
Full Name - Second Example
An example of Name de-duplication taken from a warranty registration database in the Philippines:
Full Name - Third Example
A powerful example of the algorithm's capabilities. An example of Name de-duplication taken from a database of a South Asian country:
Address - First Example
This is one of the best examples of Address de-duplication, highlighted by the algorithm within a massive CRM database in India. Not only are there inconsistent abbreviations and spelling errors, but the old and new official names of the city have also been detected as duplicates:
Address - Second Example
An example of Address de-duplication, from Singapore:
Address - Third Example
An example of a duplicate address cluster from Australia. Note the abbreviations and variations of the state name captured in the duplicate cluster:
Mobile Numbers
The algorithm finds mobile numbers in multiple formats and groups the duplicates together. Here's an example of such a group:
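DataStitch's own matching logic is not described on this page. As a minimal sketch of the general idea, the Python snippet below reduces each number to a comparison key by keeping only its digits and then taking the last ten of them. The ten-digit length is an assumption that suits the Indian-style numbers listed under Name + Mobile Number on this page, and the names `mobile_key` and `group_mobiles` are hypothetical, not part of the product.

```python
import re
from collections import defaultdict


def mobile_key(raw: str, length: int = 10) -> str:
    """Return a comparison key: the last `length` digits of the number.

    Assumption: subscriber numbers are `length` digits long, so any
    international prefix (0091) or trunk prefix (0) sits in front of them.
    """
    digits = re.sub(r"\D", "", raw)  # drop spaces, brackets, dashes
    return digits[-length:]


def group_mobiles(numbers):
    """Group raw number strings that share the same comparison key."""
    groups = defaultdict(list)
    for raw in numbers:
        groups[mobile_key(raw)].append(raw)
    return dict(groups)


# The four formats quoted in the Name + Mobile Number example.
numbers = [
    "0091 98 33 69 06 11",
    "(9833) 690-611",
    "0-98336-90611",
    "9 8 3 3 6 9 0 6 1 1",
]
groups = group_mobiles(numbers)
print(groups)
```

All four strings end up under the single key `9833690611`. A production matcher would need country-aware rules, since taking a fixed number of trailing digits can merge unrelated numbers from different countries.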
Company Name - First Example
An example of Company Name de-duplication from the Philippines:
Company Name - Second Example
Another similar example of Company Name de-duplication, from India:
Name + Mobile Number
Example of 4 duplicate Name and Mobile Number combinations as found by the system:
Name: Mohd. Kasim; Mobile: 0091 98 33 69 06 11
Name: Mhd. Kasim; Mobile: (9833) 690-611
Name: Md. Kasim; Mobile: 0-98336-90611
Name: Muhammad Kasim; Mobile: 9 8 3 3 6 9 0 6 1 1
Name + Date of Birth
Example of 4 duplicate Person's Name and Date of Birth combinations found:
Person's Name: Dr. Nicholas A. Beck; Date of Birth: 15/11/1984
Person's Name: Nicholas Albert Beck; Date of Birth: 11-15-1984
Person's Name: Nicholas A Beck, PHD; Date of Birth: 15.11.84
Person's Name: Mr. Nicholas Albert Beck; Date of Birth: 1984, novembr 15
Website URL
DataStitch groups different Website URLs that refer to the same page into a single cluster. Here's an example of such a group:
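How DataStitch decides that two URLs point to the same page is not stated here. One common approach, sketched below in Python under stated assumptions, is to reduce each URL to a canonical key and cluster on that key. The rules chosen (ignore scheme, port and fragment, strip a leading `www.`, drop trailing slashes) are illustrative assumptions, the function names `canonical_url` and `cluster_urls` are hypothetical, and the sample URLs are made up rather than taken from DataStitch output.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def canonical_url(url: str) -> str:
    """Reduce a URL to a comparison key using a few illustrative rules."""
    url = url.strip()
    # urlsplit only identifies the host when a scheme is present.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    # .hostname is already lower-cased and excludes any port.
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    # Ignore scheme, port and fragment; drop trailing slashes from the path.
    key = host + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def cluster_urls(urls):
    """Group URLs that share the same canonical key."""
    clusters = defaultdict(list)
    for url in urls:
        clusters[canonical_url(url)].append(url)
    return dict(clusters)


# Hypothetical URLs, not taken from DataStitch output.
urls = [
    "http://www.example.com/products/",
    "https://example.com/products",
    "WWW.Example.com/products#top",
    "https://example.com/contact",
]
clusters = cluster_urls(urls)
for key, members in clusters.items():
    print(key, members)
```

With these rules the first three URLs fall into one cluster and the contact page stays on its own. Whether to ignore the scheme or the query string is a judgment call that depends on the sites in the dataset.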
Manage Duplicate Records