Working with Golden Records
Curators and Reviewers compose rules to consolidate the values of specific attributes, and review and edit the resulting golden records.
A curator composes attribute-specific consolidation rules to generate golden records from a dataset with a grouping key, typically the published clusters dataset resulting from a Mastering Project.
Curators and reviewers then review the resulting records. They can edit values directly, refine the consolidation rules, or both.
Composing Consolidation Rules
For each golden record attribute, curators compose a consolidation rule. See Editing Golden Record Consolidation Rules.
Golden record consolidation rules comprise these parts:
- input attributes
- aggregation functions
- conditions
- expression aggregation functions
For convenience, when you create a Golden Records project, Tamr automatically generates a golden records dataset using the aggregation function most common value, with golden record attributes matching 1-to-1 with attributes from the input dataset. You can then customize the aggregation functions and conditions for each attribute.
Aggregation Functions
You can use an aggregation function in a golden record consolidation rule. The aggregation function is applied across the input dataset records that share a grouping key value. For the list of aggregation functions you can use, see Aggregation Functions.
For example, applying the function most common value to the attribute state in the following dataset with grouping key published_id returns the value Massachusetts.
published_id | state |
---|---|
101 | Massachusetts |
101 | Massachusetts |
101 | Ohio |
101 | Massachusetts |
101 | Ohio |
101 | Ohio |
101 | Massachusetts |
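The example above can be reproduced outside Tamr with a few lines of Python. This is only an illustrative sketch, not Tamr's implementation: the helper name most_common_value is hypothetical, and the way ties are broken here (first value encountered wins) is an assumption, since the source does not describe tie handling.

```python
from collections import Counter, defaultdict

# Rows from the example table: (published_id, state).
rows = [
    (101, "Massachusetts"),
    (101, "Massachusetts"),
    (101, "Ohio"),
    (101, "Massachusetts"),
    (101, "Ohio"),
    (101, "Ohio"),
    (101, "Massachusetts"),
]


def most_common_value(rows):
    """Hypothetical helper: return the most frequent value per grouping key."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    # Counter.most_common(1) yields [(value, count)] for the top value.
    return {key: Counter(values).most_common(1)[0][0]
            for key, values in groups.items()}


golden = most_common_value(rows)
print(golden)  # {101: 'Massachusetts'}
```

Massachusetts appears four times and Ohio three times for published_id 101, so Massachusetts is selected.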
Conditions
You can use conditions in a golden record consolidation rule. A condition filters the input dataset records within each grouping key, so that only the records that meet the condition are passed to the aggregation function. For the list of conditions, see Working with Golden Records.
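The filter-then-aggregate order can be sketched in Python as follows. Everything here is an assumption made for illustration: the consolidate helper, the non-empty condition, and the sample rows are hypothetical and are not taken from Tamr.

```python
from collections import Counter, defaultdict


def most_common(values):
    """Return the most frequent value in a non-empty list."""
    return Counter(values).most_common(1)[0][0]


def consolidate(rows, condition, aggregate):
    """Hypothetical helper: keep values that meet the condition, then aggregate per key."""
    groups = defaultdict(list)
    for key, value in rows:
        if condition(value):  # the condition is checked before aggregation
            groups[key].append(value)
    return {key: aggregate(values) for key, values in groups.items()}


# Hypothetical input: (published_id, state), with three empty state values.
rows = [(101, ""), (101, ""), (101, ""), (101, "Ohio"), (101, "Ohio")]

unfiltered = consolidate(rows, lambda v: True, most_common)
filtered = consolidate(rows, lambda v: v != "", most_common)

print(unfiltered)  # {101: ''}
print(filtered)    # {101: 'Ohio'}
```

Without the condition, the empty string is the most common value. With it, only the non-empty values reach the aggregation function. In this sketch a group with no records that meet the condition simply produces no entry; the source does not say how Tamr handles that case.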
Expression Aggregation Functions
With expression aggregation functions, you use a code editor to compose custom aggregation functions written using Tamr Transformations. Like aggregation functions, expression aggregation functions are applied after conditions. For more information, see Expression Aggregation Functions.
Editing Values
Curators and Reviewers review golden records and directly edit their values. These user-entered values override the value selected by the consolidation rule for an attribute. See Creating or Editing a Value Override for a Golden Record.
A golden record's value overrides remain unaffected when consolidation rules are created or updated. The rules panel displays the number of value overrides for each attribute, and you can filter to the records that have them. See Filtering To Records with Value Overrides.
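The precedence of value overrides over rule output can be pictured with a small Python sketch. The data model is an assumption made for illustration (overrides stored separately, keyed by record and attribute), and the names apply_overrides, rule_v1, and rule_v2 are hypothetical. It is not a description of how Tamr stores overrides.

```python
from collections import Counter


def apply_overrides(rule_output, overrides):
    """Hypothetical helper: user-entered values take precedence over rule output."""
    resolved = dict(rule_output)
    resolved.update(overrides)
    return resolved


# Hypothetical override entered by a reviewer: (published_id, attribute) -> value.
overrides = {(101, "state"): "Ohio"}

# Hypothetical rule output before and after the consolidation rule is edited.
rule_v1 = {(101, "state"): "Massachusetts", (102, "state"): "Texas"}
rule_v2 = {(101, "state"): "MA", (102, "state"): "TX"}

before = apply_overrides(rule_v1, overrides)
after = apply_overrides(rule_v2, overrides)

# Number of value overrides per attribute.
override_counts = Counter(attribute for (_record_id, attribute) in overrides)

print(before[(101, "state")], after[(101, "state")])  # Ohio Ohio
print(after[(102, "state")])                          # TX
print(dict(override_counts))                          # {'state': 1}
```

Because the overrides are kept apart from the rule output in this sketch, changing the rule updates record 102 but leaves the overridden value on record 101 in place.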