Curating and Reviewing Record Clusters
Understand expert sourcing and the Curator and Reviewer roles in a mastering project.

The continuous, iterative curation of low-confidence clusters and highly similar clusters allows Tamr to accurately cluster all records into distinct entities.
A record cluster is a group of one or more records that represent the same distinct entity. In a typical data mastering project, cluster size ranges from one record, known as a singleton cluster, to thousands or tens of thousands of records in a cluster.
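To make the definition concrete, here is a minimal sketch in Python that groups records by cluster and picks out singleton clusters. The record IDs, cluster IDs, and function names are invented for illustration and are not part of Tamr's API.

```python
from collections import defaultdict

# Hypothetical (record_id, cluster_id) assignments, invented for illustration.
assignments = [
    ("r1", "c1"),
    ("r2", "c1"),
    ("r3", "c2"),
    ("r4", "c1"),
]


def group_by_cluster(pairs):
    """Collect record IDs under the cluster they are assigned to."""
    clusters = defaultdict(list)
    for record_id, cluster_id in pairs:
        clusters[cluster_id].append(record_id)
    return dict(clusters)


def singleton_clusters(clusters):
    """Return the IDs of clusters that contain exactly one record."""
    return sorted(cid for cid, members in clusters.items() if len(members) == 1)


clusters = group_by_cluster(assignments)
singletons = singleton_clusters(clusters)
```

In this toy data, cluster c1 holds three records and c2 is a singleton cluster.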
Curating and Reviewing
Once the Curator initializes and runs the entity resolution model, Tamr classifies all record pairs and generates the first iteration of record clusters. Tamr then identifies low-confidence clusters and highly similar clusters and assigns them to Reviewers to be Merged or Split, and Locked or Unlocked.
Locking the records in a cluster (completely or partially) influences the clustering and record pair generation algorithms. Locking a record tells Tamr not to move the record from its cluster. A locked record therefore remains in its cluster and cannot be grouped with a locked record of another cluster. As the model improves, new and existing records continue to be clustered together with both locked and unlocked clusters. See Curating Record Clusters.
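The locking rules described above can be sketched as a toy model in Python. This is only an illustration of the two stated constraints (a locked record does not move, and locked records from different clusters are never grouped). The Record class, can_group, and apply_proposals are hypothetical names, and the code says nothing about how Tamr implements clustering internally.

```python
from dataclasses import dataclass


@dataclass
class Record:
    record_id: str
    cluster_id: str
    locked: bool = False


def can_group(a, b):
    """Two locked records that sit in different clusters may not be grouped."""
    if a.locked and b.locked and a.cluster_id != b.cluster_id:
        return False
    return True


def apply_proposals(records, proposals):
    """Apply proposed cluster IDs, leaving locked records where they are.

    `proposals` maps record_id -> proposed cluster_id (a stand-in for
    model output). Returns the IDs of the records that actually moved.
    """
    moved = []
    for rec in records:
        target = proposals.get(rec.record_id, rec.cluster_id)
        if rec.locked or target == rec.cluster_id:
            continue  # locked records stay put; unchanged records are skipped
        rec.cluster_id = target
        moved.append(rec.record_id)
    return moved


records = [
    Record("r1", "c1", locked=True),
    Record("r2", "c1"),
    Record("r3", "c2", locked=True),
    Record("r4", "c3"),
]
moved = apply_proposals(records, {"r1": "c2", "r2": "c2", "r4": "c1"})
```

Here the proposal to move the locked record r1 is ignored, while the unlocked records r2 and r4 follow the new assignments.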