Filtering Clusters

To make your work with clusters more efficient, you can apply filters to clusters, records, or both.

Based on feedback for record pairs, Tamr Core identifies records as duplicates and groups them into clusters. When you review clusters, you typically focus on a sample of representative clusters. Filters help you select clusters and records so that you can evaluate whether all records in a cluster are homogeneous and represent the same real-world entity, and whether any records are missing from the cluster.

Filters for Clusters and Records

The Clusters page presents both a list of clusters on the left side of the page, and a list of records on the center-right side of the page. Each list has a dedicated filter to help you locate clusters and records for analysis, assignment, or review.

The filter icon on the left side of the Clusters page applies to clusters, and the icon in the center of the page applies to records.

Applying a filter to the list of clusters on the left does not change the list of records shown. Similarly, applying a filter to the list of records does not change the list of clusters. However, when you select one or more clusters, the list of records is reduced to show only the records in those clusters. Applying a filter to that set of records can reduce the list further. For example, you can filter to high-impact clusters only, select a cluster to examine further, and then filter to records from a specific source dataset within that high-impact cluster.
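The interaction between the two lists can be sketched as a small model. This is only an illustration of the behavior described above, not Tamr Core code: the `Cluster` and `Record` classes, the `visible_clusters` and `visible_records` functions, and the sample dataset names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    cluster_id: str
    high_impact: bool


@dataclass(frozen=True)
class Record:
    record_id: str
    cluster_id: str
    source: str


def visible_clusters(clusters, cluster_filter=None):
    """Cluster list: affected only by the cluster filter."""
    return [c for c in clusters if cluster_filter is None or cluster_filter(c)]


def visible_records(records, selected_cluster_ids=(), record_filter=None):
    """Record list: narrowed by cluster selection, then by the record filter."""
    selected = set(selected_cluster_ids)
    shown = [r for r in records if not selected or r.cluster_id in selected]
    return [r for r in shown if record_filter is None or record_filter(r)]


# Hypothetical sample data.
clusters = [Cluster("c1", True), Cluster("c2", False)]
records = [
    Record("r1", "c1", "crm"),
    Record("r2", "c1", "erp"),
    Record("r3", "c2", "crm"),
]

# Filtering the cluster list does not touch the record list.
high_impact = visible_clusters(clusters, lambda c: c.high_impact)
all_records = visible_records(records)

# Selecting a cluster narrows the records; a record filter narrows them further.
in_cluster = visible_records(records, [high_impact[0].cluster_id])
from_crm = visible_records(records, ["c1"], lambda r: r.source == "crm")
```

In this model, the cluster filter and the record filter never see each other. Only the cluster selection connects the two lists.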

Different options for filtering are available for clusters and records.

Options for Filtering Clusters

You can select one or more of the following options to filter clusters.

Filter: Description and Options

My open assignments: Filter to your open or resolved assignments.

Options:
- My open assignments
- My resolved assignments

High-impact: Filter to high-impact clusters.

High-impact clusters are those clusters from which the Tamr model learns the most. In your initial cluster review, use the high-impact filter and curate all of the listed clusters. This helps ensure meaningful precision and recall metrics for clusters.

Verification: Filter by the verification status of the cluster and its records.

Options:
- Has records verified in current cluster
- With move suggested
- Has records verified in another cluster
- Has no verified records

Average confidence: Filter by Tamr Core's average confidence in its cluster suggestions.

Options:
- High
- Medium
- Low
- Custom range

Similarity: Filter by cluster similarity, which is the measure of how similar the records within the cluster are to one another.

Specify a percentage from 0 to 100.

Cluster changes from last publish: Filter to clusters that have or have not changed since they were last published.

Options:
- Unchanged
- With changes
- Records added from new or updated sources
- Records moved from other clusters
- Records moved to other clusters
- Records deleted from sources
- New clusters
- Empty clusters

Source: Filter by selected source datasets.

Clusters meet the filter if they include one or more records from the selected datasets.
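The matching rule for the Source filter on clusters ("one or more records from the selected datasets") can be expressed as a set-overlap check. The following is a minimal sketch of that rule only. The function name `cluster_matches_sources` and the dataset names are hypothetical and are not part of Tamr Core.

```python
def cluster_matches_sources(record_sources, selected_sources):
    """Return True if at least one record in the cluster comes from a selected dataset."""
    return not set(record_sources).isdisjoint(selected_sources)


# Hypothetical clusters, each listed with the source dataset of every record it contains.
cluster_sources = {
    "c1": ["crm", "erp"],
    "c2": ["erp"],
    "c3": ["web"],
}
selected = {"crm", "web"}

matching = [
    cluster_id
    for cluster_id, sources in cluster_sources.items()
    if cluster_matches_sources(sources, selected)
]
```

Here `c1` matches even though it also contains a record from an unselected dataset, because the rule requires only one matching record.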

Options for Filtering Records

You can select one or more of the following options to filter records.

Filter: Description and Options

Verification: Filter by the verification status of records.

Options:
- Verified
- Verified in current cluster
- Verified in another cluster
- Not verified

Suggestions: Filter by whether suggestions are enabled, disabled, or auto-accepted.

Options:
- Suggestions enabled
- Move suggested
- No move suggested
- Suggestions disabled
- Suggestions auto-accepted

Comments: Filter by whether reviewers have entered comments for the record.

Test records: Filter by test records with and without problems to help identify issues with your data. Tamr Core uses test records to compute cluster precision and recall metrics. See test datasets.

Options:
- Test records problems
- Test records with only precision problems
- Test records with only recall problems
- Test records with both precision and recall problems
- Test records with no problems

Record changes from last publish: Filter by how records have changed since they were last published.

Options:
- New records added from updated sources
- Records moved between clusters
- Records stayed in current clusters
- Records deleted from updated sources

Sources: Filter by selected source datasets.

Records meet the filter if they are included in one or more of the selected datasets.

Removing Filters

To remove your filtering choices, select the remove (close) icon next to the filter icon.