Based on record grouping criteria and feedback for pairs, Tamr Core identifies records as duplicates and organizes them into clusters.
Tip: When Tamr Core initially generates clusters, all records that are in the same record group are assigned to the same cluster. The groups themselves do not appear on the Clusters page.
When you review clusters, you typically focus on a sample of representative clusters. Filters help you select clusters and records so that you can evaluate whether all records in a cluster are homogeneous and represent the same real-world entity, and whether any records that belong in the cluster are missing from it.
The Clusters page presents a list of clusters on the left side of the page and a list of records on the center-right side of the page. Each list has a dedicated filter to help you locate clusters and records for analysis, assignment, or review.
Applying a filter to the list of clusters on the left does not change the list of records shown. Similarly, applying a filter to the list of records does not change the list of clusters. However, when you select one or more clusters, the list of records is reduced to show only the records in those clusters. Applying a filter to that set of records can reduce the list further. For example, you can filter to high-impact clusters only, select a cluster to examine further, and then filter to records from a specific source dataset within that high-impact cluster.
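The interaction between the two lists can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not Tamr Core code or its API: the `Cluster` and `Record` classes, the `visible_clusters` and `visible_records` functions, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    # Hypothetical stand-in for a row in the cluster list.
    cluster_id: str
    high_impact: bool


@dataclass(frozen=True)
class Record:
    # Hypothetical stand-in for a row in the record list.
    record_id: str
    cluster_id: str
    source: str


def visible_clusters(clusters, cluster_filter=None):
    """The cluster list depends only on the cluster filter."""
    return [c for c in clusters if cluster_filter is None or cluster_filter(c)]


def visible_records(records, selected_cluster_ids=(), record_filter=None):
    """The record list depends on the selected clusters and the record
    filter, never on the cluster filter."""
    selected = set(selected_cluster_ids)
    # With no clusters selected, every record stays in the list.
    shown = [r for r in records if not selected or r.cluster_id in selected]
    # The record filter narrows whatever the selection left visible.
    return [r for r in shown if record_filter is None or record_filter(r)]


clusters = [Cluster("c1", True), Cluster("c2", False)]
records = [
    Record("r1", "c1", "crm"),
    Record("r2", "c1", "erp"),
    Record("r3", "c2", "crm"),
]

# Step 1: filter the cluster list to high-impact clusters.
high_impact = visible_clusters(clusters, lambda c: c.high_impact)
# The record list is unaffected by the cluster filter.
all_records = visible_records(records)
# Step 2: selecting a cluster narrows the record list.
in_c1 = visible_records(records, selected_cluster_ids=["c1"])
# Step 3: a record filter narrows the selection further.
in_c1_from_crm = visible_records(records, ["c1"], lambda r: r.source == "crm")
```

In this sketch, filtering clusters alone leaves all three records listed; only selecting cluster `c1`, and then filtering by source, shrinks the record list.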
Different options for filtering are available for clusters and records.
You can select one or more of the following options to filter clusters.
Description and Options
My open assignments
Filter to your open or resolved assignments.
Filter to high-impact clusters.
High-impact clusters are those clusters from which the Tamr model learns the most. In your initial cluster review, use the high-impact filter and curate all of the listed clusters. This helps ensure meaningful precision and recall metrics for clusters.
Filter by verification status of cluster and records.
Filter by Tamr Core's average confidence in its cluster suggestions.
Filter by cluster similarity, which is the measure of how similar the records within the cluster are to one another.
Specify a percentage from 0 to 100.
Cluster changes from last publish
Filter by clusters that have or have not changed since they were last published.
Source
Filter by selected source datasets. Clusters meet the filter if they include one or more records from the selected datasets.
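The matching rule for the cluster source filter is an "at least one record" test rather than an "all records" test. The following Python sketch illustrates that rule only. The function name and the dataset names are hypothetical, and treating an empty selection as "no filter applied" is an assumption, not documented behavior.

```python
def cluster_matches_sources(cluster_record_sources, selected_sources):
    """Return True when at least one record in the cluster comes from one
    of the selected source datasets.

    Assumption: an empty selection means the filter is not applied, so
    every cluster matches.
    """
    selected = set(selected_sources)
    if not selected:
        return True
    return any(source in selected for source in cluster_record_sources)


# A cluster whose records come from two datasets matches if either one is
# selected, even though not every record comes from the selected dataset.
mixed_cluster = ["crm", "erp", "crm"]
single_source_cluster = ["crm"]

matches_mixed = cluster_matches_sources(mixed_cluster, ["erp"])
matches_single = cluster_matches_sources(single_source_cluster, ["erp"])
```

Here `matches_mixed` is true because one record comes from `erp`, while `matches_single` is false because none do.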
You can select one or more of the following options to filter records.
Description and Options
Filter by verification status of records.
Filter by whether suggestions are enabled, disabled, or auto-suggested.
Filter by whether reviewers have entered comments for the record.
Record changes from last publish
Filter by records that have changed since they were last published.
Sources
Filter by selected source datasets. Records meet the filter if they are included in one or more of the selected datasets.
To remove your filtering choices, select Remove next to the filter.