Enrichment in the Mastering Workflow

Use enrichment as part of the mastering workflow to add enriched data to a unified dataset.

Tamr recommends the following workflow to incorporate enrichment into mastering projects.

  1. Perform schema mapping on your input datasets to create a single unified dataset. See Schema Mapping Workflow.
  2. Use the unified dataset as the input for each enrichment project. For example, the same unified dataset can be the input for the Country Code, Phone Enrichment, and Email Enrichment projects.
  3. Use a script transformation to join the enrichment project output datasets with the unified dataset. For example:
     LEFT OUTER JOIN WITH enrichment_output_dataset as results
     on get(enrichment_input_dataset_primary_key, 0) == results.tamr_id;
     select *,
     results.valid as valid,
     results.country_code as phone_country_code,
     results.cleaned_number as cleaned_number,
     results.national_format as national_format,
     results.international_format as international_format,
     results.region as region,
     results.type as type,
     results.carrier as carrier;
  4. Create the unified attributes added by enrichment. For example, after the above transformation you would create the attributes valid, phone_country_code, and so on. See Mapping Unified Attributes.
  5. Perform record mastering on the updated unified dataset and create golden records. See Mastering Project Workflow and Golden Records Workflow.
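
To see what the join in step 3 does to the data, the following plain-Python sketch models a left outer join on a small, made-up dataset. It is an illustration of the join semantics only, not Tamr code: the function left_outer_join, the sample records, and the primary_key column name are all hypothetical, and it assumes that get(..., 0) reads the first element of an array-valued key. Only three of the enrichment columns are shown.

```python
# Illustrative only: a plain-Python model of a left outer join, not Tamr code.
# The records, column names, and helper below are hypothetical.

# Enrichment output column -> name it receives in the unified dataset.
ENRICHED_COLUMNS = {
    "valid": "valid",
    "country_code": "phone_country_code",
    "cleaned_number": "cleaned_number",
}


def left_outer_join(unified_rows, enrichment_rows, key_column):
    """Attach enrichment columns to every unified row, matched or not."""
    # Index the enrichment output by tamr_id for direct lookup.
    by_id = {row["tamr_id"]: row for row in enrichment_rows}
    joined = []
    for row in unified_rows:
        # Assumption: the key column holds an array, so take element 0,
        # mirroring get(enrichment_input_dataset_primary_key, 0).
        keys = row[key_column]
        key = keys[0] if keys else None
        match = by_id.get(key)
        merged = dict(row)  # corresponds to "select *"
        for source, target in ENRICHED_COLUMNS.items():
            # Unmatched rows are kept, with None in the enrichment columns.
            merged[target] = match[source] if match else None
        joined.append(merged)
    return joined


unified = [
    {"primary_key": ["a1"], "phone": "+1 617 555 0100"},
    {"primary_key": ["b2"], "phone": "not a number"},
]
enrichment_output = [
    {"tamr_id": "a1", "valid": True, "country_code": "US",
     "cleaned_number": "6175550100"},
]

result = left_outer_join(unified, enrichment_output, "primary_key")
for row in result:
    print(row)
```

Every record from the unified dataset appears in the result. A record with no matching enrichment row keeps its original values and has empty enrichment columns, which is why a left outer join is used here rather than an inner join.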

