
Curating Record Pairs

Curate record pairs with assignments and responses.

Assigning and Curating Record Pairs

Both curators and verifiers assign and verify record pairs in the mastering project workflow. Before proceeding, see Viewing and Verifying Record Pairs.

Applying Feedback and Updating Mastering Results

The Apply feedback and update results job:

  • Trains the model using the latest verified record pair feedback.
  • Applies the model to all candidate pairs (replaces system pair suggestions with new ones).
  • Clusters using the latest record cluster feedback.
  • Assigns persistent IDs to clusters on the initial run only (publishes clusters).
  • Generates new high-impact pairs.

Tip: It is a good practice to label at least 50 record pairs, including both matching and non-matching pairs if possible, before initiating this job for the first time. See Viewing and Verifying Record Pairs.

To apply feedback and update mastering results:

  1. Navigate to the Pairs page.
  2. Select Apply feedback and update results. See Monitoring Job Status.
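
You can also run this workflow programmatically with the Tamr Core Python client (tamr-unify-client). The following is a minimal sketch, not an official recipe: the host, credentials, and the project name my_mastering_project are placeholder assumptions you would replace with your own values.

```python
# A minimal sketch using the tamr-unify-client Python library.
# The host, credentials, and project name are placeholder assumptions.
from tamr_unify_client import Client
from tamr_unify_client.auth import UsernamePasswordAuth

auth = UsernamePasswordAuth("username", "password")
tamr = Client(auth, host="localhost")
project = tamr.projects.by_name("my_mastering_project").as_mastering()

model = project.pair_matching_model()
model.train().wait()    # train on the latest verified pair feedback
model.predict().wait()  # apply the model to all candidate pairs

project.record_clusters().refresh().wait()     # re-cluster with latest feedback
project.published_clusters().refresh().wait()  # publish (persistent cluster IDs)
project.high_impact_pairs().refresh().wait()   # generate new high-impact pairs
```

Each call returns an operation; waiting on it is the programmatic equivalent of watching the job complete on the Jobs page.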

Updating Mastering Results

The Update results job:

  • Applies the model to all candidate pairs (replaces system pair suggestions with new ones).
  • Clusters using the latest record cluster feedback.
  • Assigns persistent IDs to clusters on the initial run only (publishes clusters).

Tip: The Update results job can only be run after at least one Apply feedback and update results job is complete.

To update mastering results:

  1. Navigate to the Pairs page.
  2. Select the dropdown arrow next to Apply feedback and update results, and then select Update results only. See Monitoring Job Status.
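
Programmatically, updating results only corresponds to the sketch in the previous section without the training step. The same caveats apply: this is a hedged sketch against tamr-unify-client, with placeholder host, credentials, and project name.

```python
# Update results only: apply the already-trained model without retraining.
# Same placeholder host, credentials, and project name as the sketch above.
from tamr_unify_client import Client
from tamr_unify_client.auth import UsernamePasswordAuth

auth = UsernamePasswordAuth("username", "password")
tamr = Client(auth, host="localhost")
project = tamr.projects.by_name("my_mastering_project").as_mastering()

project.pair_matching_model().predict().wait()  # re-score all candidate pairs
project.record_clusters().refresh().wait()      # re-cluster with latest feedback
project.published_clusters().refresh().wait()   # publish clusters
```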

Viewing In-Sample Pair Metrics

Each time you apply feedback and update results, Tamr Core recomputes the following performance metrics by comparing the model's predictions with verified feedback from experts:

  • Accuracy: The ratio of correct Tamr Core predictions to all pairs with expert feedback; the overall correctness of suggestions, measured by (True Positives + True Negatives)/(all pairs with expert feedback).
  • Precision: The ratio of correct matches to all matches predicted by the model. This is a measure of the effectiveness of finding true matching pairs, measured by (True Positives)/(True Positives + False Positives).
  • Recall: The ratio of correct Tamr Core matching pairs to all matching pairs identified by expert feedback. This is a measure of how well Tamr Core does at not missing any matching pairs, measured by (True Positives)/(True Positives + False Negatives).
  • F score: The harmonic mean of precision and recall, measured by 2 × (Precision × Recall)/(Precision + Recall), where 1 indicates perfect precision and recall.

Note: Checking these metrics for an upward trend as you iterate can help you evaluate the effect of the changes you make to your data, blocking model, or record pairs. These metrics compare system-generated pair suggestions to the verified responses from experts. They do not indicate how accurate the model is for all record pairs. See the precision and recall metrics for clusters for metrics computed for test records.

For more information about these metrics, see Precision and Recall.
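
As a quick illustration of how these metrics relate to the confusion matrix, the snippet below computes them from confusion-matrix counts. The counts are hypothetical numbers chosen for the example, not output from Tamr Core.

```python
# A worked example: compute in-sample pair metrics from
# confusion-matrix counts (hypothetical verified-pair numbers).
tp, fp, tn, fn = 40, 5, 50, 5

accuracy = (tp + tn) / (tp + fp + tn + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f_score = 2 * precision * recall / (precision + recall)

print(f"Accuracy:  {accuracy:.2f}")   # 0.90
print(f"Precision: {precision:.2f}")  # 0.89
print(f"Recall:    {recall:.2f}")     # 0.89
print(f"F score:   {f_score:.2f}")    # 0.89
```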

To view a confusion matrix and in-sample pair metrics:

  1. Navigate to the Pairs page.
  2. In the bottom right corner, select Show details. The confusion matrix opens with the current values.
    Note: This confusion matrix can help you determine if Tamr mistakes are biased towards matching or non-matching labels.
A confusion matrix with metrics based on labeled pairs.

Tip: To filter record pairs, select any quadrant in this visualization.

  3. To show the computed values for the in-sample pair metrics, select View advanced metrics. A panel with these values appears to the right of the matrix (shown above).
  4. If cluster metrics are available for your mastering project, select View Cluster Accuracy to open the precision and recall metrics for clusters.
