Data Fingerprinting
Accelerating Data Mapping and Unification using Fingerprints

Modak’s Data Fingerprinting provides an index value that differentiates a record from other record values. The value is called the fuzzy value and the index is called the fuzzy index. Together they are called the fingerprints of the data, and they are unique in nature. These values are used to match similar leaves of a branch.

Why is Data Fingerprinting useful?

In this process, column values are compared across different tables and a hash code is generated for each column. Regardless of how a column is labelled in different tables, if the columns share the same data, an algorithm generates a score from 0 to 1 that indicates how much of the data matches. The columns are then mapped to each other and their data is merged.

For example, suppose different tables label a column “col”, “column”, and “col1”, but the data held in those columns is the same. The data is checked, a hash is generated for each column, a score between 0 and 1 is generated, and the columns are then mapped to each other and merged.
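
The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Modak's actual algorithm, which the text does not specify. The sketch assumes that a column's fingerprint is the set of SHA-256 hashes of its normalized values and that the score is the Jaccard overlap of two such sets. The function names (fingerprint, match_score, map_columns), the sample tables, and the 0.8 threshold are all hypothetical.

```python
import hashlib


def fingerprint(values):
    """Return a set of hashes, one per distinct normalized value in a column.

    Assumption: values are normalized by trimming whitespace and lowercasing.
    """
    return {
        hashlib.sha256(str(v).strip().lower().encode("utf-8")).hexdigest()
        for v in values
        if v is not None
    }


def match_score(fp_a, fp_b):
    """Jaccard overlap of two fingerprints: 0.0 means no shared data, 1.0 means identical."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def map_columns(table_a, table_b, threshold=0.8):
    """Pair each column of table_a with its best-scoring column in table_b.

    Column names are ignored; only the data decides the mapping.
    The threshold is an arbitrary illustrative cut-off.
    """
    fps_b = {name: fingerprint(vals) for name, vals in table_b.items()}
    mapping = {}
    for name_a, vals_a in table_a.items():
        fp_a = fingerprint(vals_a)
        best_name, best_score = None, 0.0
        for name_b, fp_b in fps_b.items():
            score = match_score(fp_a, fp_b)
            if score > best_score:
                best_name, best_score = name_b, score
        if best_name is not None and best_score >= threshold:
            mapping[name_a] = (best_name, best_score)
    return mapping


# Hypothetical tables: the same data sits under different column labels.
customers = {"col": ["Alice", "Bob", "Carol", "Dave"], "id": [1, 2, 3, 4]}
orders = {"col1": ["alice", "bob ", "carol", "dave"], "amount": [10, 20, 30, 40]}

print(map_columns(customers, orders))  # {'col': ('col1', 1.0)}
```

Here “col” and “col1” are mapped to each other with a score of 1.0 even though their labels differ, while “id” and “amount” share no values and are left unmapped. A production system would likely use a more compact fingerprint and a fuzzier comparison than exact hash equality. The set-based version is used here only because it is easy to follow.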