Modak’s Data Fingerprinting provides an index value that distinguishes one record value from another. This value is called the fuzzy value, and the index is called the fuzzy index. Together they serve as fingerprints of the data: each one is unique, and they are used to match similar leaves of a branch.
In this process, column values are compared across different tables, and a hash code is generated for each column. Regardless of how a column is named in each table, if the columns share the same data, an algorithm generates a score from 0 to 1 that indicates how much of the data matches. The data is then mapped and merged.
For example, suppose different tables label a column “col”, “column”, and “col1”, but the columns hold the same data. The data is checked, a hash is generated for each column, a score between 0 and 1 is generated, and the data is then mapped by merging the columns.
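The idea can be illustrated with a minimal Python sketch. It is not Modak's actual algorithm, which is not described here. It assumes a SHA-256 hash over the normalized distinct values as the column fingerprint and Jaccard similarity as the 0-to-1 score. The function names (`column_fingerprint`, `overlap_score`, `match_columns`), the sample tables, and the 0.8 threshold are all hypothetical.

```python
import hashlib


def _normalize(values):
    # Assumption: values are compared as trimmed, lower-cased strings.
    return {str(v).strip().lower() for v in values}


def column_fingerprint(values):
    """Hash of a column's distinct values, independent of row order and column name."""
    joined = "\x1f".join(sorted(_normalize(values)))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def overlap_score(left, right):
    """Jaccard similarity of two columns' distinct values, from 0.0 to 1.0."""
    a, b = _normalize(left), _normalize(right)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_columns(tables, threshold=0.8):
    """Return (table_a, col_a, table_b, col_b, score) for cross-table column
    pairs whose score is at or above the (hypothetical) threshold."""
    columns = [
        (table, name, values)
        for table, cols in tables.items()
        for name, values in cols.items()
    ]
    matches = []
    for i, (ta, ca, va) in enumerate(columns):
        for tb, cb, vb in columns[i + 1:]:
            if ta == tb:
                continue  # only compare columns from different tables
            score = overlap_score(va, vb)
            if score >= threshold:
                matches.append((ta, ca, tb, cb, score))
    return matches


# Hypothetical sample data: same values under three different column names.
tables = {
    "orders": {"col": ["A1", "B2", "C3", "D4"]},
    "invoices": {"column": ["a1", "b2", "c3", "d4"]},
    "shipments": {"col1": ["A1", "B2", "C3", "Z9"]},
}

for match in match_columns(tables):
    print(match)
```

In this sketch, identical fingerprints indicate that two columns hold the same set of values, while the score handles partial overlap. Column pairs that clear the threshold are the candidates for mapping and merging.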