Mastering Project Workflow
Curate record pairs and clusters in a mastering project.
Curator is the role given to subject matter experts who compose business rules and apply or reject the validation responses made by other team members.
When you are a curator for a mastering project, you:
- Add datasets to your project, create the unified schema, map input attributes to the attributes of the unified schema, and specify which attributes the model should use to find potential matches. See Creating the Unified Dataset for Mastering.
- Enable the record grouping feature and select attributes that are key to identifying records that are exact matches. See Grouping Obvious Duplicates.
- Define a blocking model that specifies which records are similar enough to be potential matches and eliminates records that are too dissimilar to ever match. See Defining the Blocking Model.
- Assist verifiers by labeling an initial set of pairs to train the model, and then by assigning pairs to team members for review.
- Optionally, contribute to the team effort of reviewing, labeling, and verifying pairs as matching or non-matching, and reviewing and verifying records in clusters.
- Optionally, enable learned pairs so that your feedback on clusters helps Tamr Core assign more accurate Match and No Match labels. See learned pairs.
- Determine when to iterate the mastering project workflow by initiating the jobs that apply expert feedback and update the machine learning models for grouping, blocking, pairing, and record clustering. See Managing Jobs.
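The grouping and blocking steps in the list above can be illustrated with a small, generic sketch. This is not Tamr Core's implementation or API. The sample records, the field names, and the functions `group_exact`, `blocking_key`, and `candidate_pairs` are all hypothetical, chosen only to show the idea: exact matches on key attributes are grouped, and only records that share a blocking key become candidate pairs for review.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"id": 1, "name": "Acme Corp", "city": "Boston"},
    {"id": 2, "name": "Acme Corp", "city": "Boston"},
    {"id": 3, "name": "Acme Corporation", "city": "Boston"},
    {"id": 4, "name": "Zenith Ltd", "city": "Denver"},
]


def group_exact(records, keys):
    """Group record ids whose values for every attribute in `keys` are identical."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[k] for k in keys)].append(record["id"])
    return list(groups.values())


def blocking_key(record):
    """Toy blocking rule (an assumption): same city and same first letter of name."""
    return (record["city"].lower(), record["name"][:1].lower())


def candidate_pairs(records):
    """Return id pairs that share a blocking key; other pairs are never compared."""
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record["id"])
    pairs = []
    for ids in blocks.values():
        pairs.extend(combinations(sorted(ids), 2))
    return sorted(pairs)


# For simplicity, the two steps run independently on the same records here.
groups = group_exact(records, ["name", "city"])
pairs = candidate_pairs(records)
print(groups)  # [[1, 2], [3], [4]]
print(pairs)   # [(1, 2), (1, 3), (2, 3)]
```

In this sketch, record 4 never appears in a candidate pair because it shares a blocking key with no other record. A looser blocking rule produces more pairs to review, and a stricter one risks excluding true matches.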