Mastering Project Workflow

Curate record pairs and clusters in a mastering project.

Curator is the role given to subject matter experts who compose business rules and apply or reject the validation responses made by other team members.

When you are a curator for a mastering project, you:

  • Add datasets to your project and profile them. See Working with Datasets in Projects.
  • Create the unified schema for your project.
  • Map input attributes to the attributes of a unified schema, and specify which attributes the model should use to find potential matches. See Creating the Unified Dataset for Mastering.
  • Define a blocking model, which gives the model guidelines about which records are similar enough to possibly match and eliminates records that are too dissimilar to ever match. See Defining the Blocking Model.
  • Assist verifiers by training initial pairs and then by assigning record pairs to team members for review.
  • Optionally, contribute to the team effort of reviewing, labeling, and verifying record pairs as matching or non-matching pairs, and reviewing and verifying records in clusters.
  • Optionally, enable learned pairs so that your feedback on clusters helps Tamr Core produce more accurate Match and No Match labels. See learned pairs.
  • Determine when to iterate the mastering project workflow by initiating the jobs that apply expert feedback and update the machine learning models for blocking, record pairing, and record clustering. See Managing Jobs.
  • Optionally, include enriched data as part of the mastering project, if enrichment is enabled for your organization. See Managing Enrichment Projects.
The iterative workflow of mastering projects.
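
The blocking step in the curator task list can be illustrated with a small, generic sketch. This is not Tamr Core's implementation or API. It only shows the general idea of blocking: records are grouped by a cheap key, and only records that share a key become candidate pairs, so clearly dissimilar records are never compared. The sample records, the field names, and the rule in `blocking_key` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample records; field names are illustrative only.
records = [
    {"id": 1, "name": "Acme Corp", "zip": "02139"},
    {"id": 2, "name": "ACME Corporation", "zip": "02139"},
    {"id": 3, "name": "Zenith Labs", "zip": "94105"},
    {"id": 4, "name": "Acme Corp.", "zip": "94105"},
]


def blocking_key(record):
    """Hypothetical blocking rule: same ZIP code and same first letter of the name."""
    return (record["zip"], record["name"][0].lower())


def candidate_pairs(records):
    """Return sorted pairs of record ids that share a blocking key."""
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record["id"])

    pairs = set()
    for ids in blocks.values():
        # Only records inside the same block are paired for later review.
        pairs.update(combinations(sorted(ids), 2))
    return sorted(pairs)


pairs = candidate_pairs(records)
total = len(records) * (len(records) - 1) // 2
print(f"{len(pairs)} candidate pair(s) out of {total} possible: {pairs}")
```

With this rule, records 1 and 2 form the only candidate pair. Record 4 has a similar name but a different ZIP code, so it is excluded, which shows why an overly strict blocking rule can hide true matches and why a curator may need to revise the rule between iterations.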