Curating and Reviewing Record Clusters
Understand expert sourcing and the Curator and Reviewer roles in a mastering project.

The continuous, iterative curation of low-confidence clusters and highly similar clusters allows Tamr to accurately cluster all records into distinct entities.
A record cluster is a group of one or more records that represent the same distinct entity. In a typical data mastering project, cluster size ranges from one record, known as a singleton cluster, to thousands or tens of thousands of records in a cluster.
Curating and Reviewing
Once the Curator initializes and runs the entity resolution model, Tamr classifies all record pairs and generates the first iteration of record clusters. Tamr then identifies low-confidence clusters and highly similar clusters and assigns them to Reviewers. Reviewers can use the following four types of verification actions:
- Verify and Enable Suggestions,
- Verify and Disable Suggestions,
- Verify and Auto-Accept Suggestions, and
- Remove Verifications.
Note: The Verify and Disable Suggestions option is equivalent to the Lock clusters option that was available in releases before 2019.026.
Locking cluster records with Verify and Disable Suggestions influences the clustering and record pair generation algorithms. Using this option on a cluster record tells Tamr not to move the record out of its cluster. A locked record therefore remains in its cluster and cannot be grouped with a locked record from another cluster. As the Tamr clustering model improves, new and existing records continue to be clustered with both locked and unlocked clusters. See Reviewing Clusters and Curating Clusters.
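The locking rules described above can be illustrated with a minimal sketch. This is not Tamr's implementation or API: the Record class, the locked flag, and the apply_suggestions and can_merge functions are hypothetical names invented for illustration, and the sketch assumes a cluster is simply a shared cluster_id.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical model of a clustered record; not a Tamr data structure.
    record_id: str
    cluster_id: str
    locked: bool = False  # assumed flag set by "Verify and Disable Suggestions"


def apply_suggestions(records, suggestions):
    """Move unlocked records to their suggested cluster; leave locked ones alone.

    `suggestions` maps record_id -> suggested cluster_id (hypothetical shape).
    Returns the ids of the records that were actually moved.
    """
    moved = []
    for record in records:
        target = suggestions.get(record.record_id)
        if target is None or target == record.cluster_id:
            continue  # no suggestion, or already in the suggested cluster
        if record.locked:
            continue  # a locked record never leaves its cluster
        record.cluster_id = target
        moved.append(record.record_id)
    return moved


def can_merge(records, cluster_a, cluster_b):
    """Two clusters cannot be merged if each holds at least one locked record."""

    def has_locked(cluster_id):
        return any(r.locked for r in records if r.cluster_id == cluster_id)

    return not (has_locked(cluster_a) and has_locked(cluster_b))


# Example: r1 and r3 are locked in different clusters.
records = [
    Record("r1", "A", locked=True),
    Record("r2", "A"),
    Record("r3", "B", locked=True),
    Record("r4", "C"),
]
moved = apply_suggestions(records, {"r1": "B", "r2": "B", "r4": "A"})
```

In this sketch the suggestion to move r1 is ignored because r1 is locked, while the unlocked records r2 and r4 follow their suggestions. Clusters A and B can no longer be merged, since each contains a locked record.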