Defining the Blocking Model
Create a blocking model so that Tamr can generate pairs of records that match and pairs of records that don't match.
Adding a Blocking Clause
A blocking clause consists of one or more terms joined by a logical AND. Multiple clauses are joined by a logical OR, so a record pair qualifies when it satisfies every term in at least one clause.
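The AND/OR structure can be illustrated with a minimal Python sketch. This is not Tamr's implementation or API: the names `same_value`, `is_candidate_pair`, and the sample attributes are hypothetical, and exact equality stands in for a real similarity test.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
# A term takes two records and returns True when they are similar enough.
Term = Callable[[Record, Record], bool]
Clause = List[Term]   # terms joined by AND
Model = List[Clause]  # clauses joined by OR


def same_value(attribute: str) -> Term:
    """Hypothetical term: both records have the same value for one attribute."""
    def term(a: Record, b: Record) -> bool:
        return a.get(attribute) is not None and a.get(attribute) == b.get(attribute)
    return term


def is_candidate_pair(model: Model, a: Record, b: Record) -> bool:
    """Keep the pair if every term holds in at least one clause."""
    return any(all(term(a, b) for term in clause) for clause in model)


model: Model = [
    [same_value("last_name"), same_value("city")],  # clause 1: last_name AND city
    [same_value("email")],                          # OR clause 2: email
]

r1 = {"last_name": "Ng", "city": "Boston", "email": "a@example.com"}
r2 = {"last_name": "Ng", "city": "Boston", "email": "b@example.com"}
r3 = {"last_name": "Lee", "city": "Austin", "email": "a@example.com"}

print(is_candidate_pair(model, r1, r2))  # True: clause 1 is satisfied
print(is_candidate_pair(model, r1, r3))  # True: clause 2 is satisfied
print(is_candidate_pair(model, r2, r3))  # False: no clause is satisfied
```

Adding a term to a clause can only narrow the set of pairs it produces, while adding a clause can only widen the overall set.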
To add a blocking clause:
- Navigate to the Pairs page of a Mastering project.
- Click Manage pair generation. A template for the first term appears. To fill it in, see "Configuring a Blocking Term" below.
- Select Add another row to add another term to the clause.
- To start a new clause, hover over the area between two existing terms and click the "OR" separator that appears.

Configuring a Blocking Term
To configure a blocking term:
- Navigate to the Pairs page of a Mastering project.
- Click Manage pair generation.
- Select the unified attribute name.
- Select a similarity threshold (%) and similarity function.
- For text attributes, select a tokenizer.
See Tokenizers and Similarity Functions.
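As a rough illustration of how a tokenizer, a similarity function, and a threshold combine into one term, consider the sketch below. It assumes Jaccard similarity over token sets and two simple tokenizers (whitespace words and character bigrams). These are illustrative choices, not necessarily the functions or tokenizers Tamr offers, and all function names are hypothetical.

```python
from typing import Callable, Set


def word_tokenizer(text: str) -> Set[str]:
    """Split on whitespace after lowercasing."""
    return set(text.lower().split())


def bigram_tokenizer(text: str) -> Set[str]:
    """Overlapping two-character substrings after lowercasing."""
    s = text.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def term_matches(x: str, y: str,
                 tokenizer: Callable[[str], Set[str]],
                 threshold_percent: float) -> bool:
    """True when the similarity, as a percentage, meets the threshold."""
    return jaccard(tokenizer(x), tokenizer(y)) * 100 >= threshold_percent


a, b = "Acme Corp", "Acme Corporation"
# Word tokens share 1 of 3 distinct tokens, below a 50% threshold.
print(term_matches(a, b, word_tokenizer, 50))    # False
# Bigrams share 8 of 14 distinct tokens, above a 50% threshold.
print(term_matches(a, b, bigram_tokenizer, 50))  # True
```

The same pair of values can pass or fail a term depending on the tokenizer, which is why the tokenizer and threshold are best chosen together.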
Estimating Record Pair Counts
To estimate record pair counts:
- Navigate to the Pairs page of a Mastering project.
- Click Manage pair generation.
- Configure the blocking terms in one or more clauses.
- Click Estimate Counts.
Excluding Pair Generation Within a Dataset
You can exclude a source dataset from within-dataset pair generation, for example when the dataset is known to be free of duplicate records. For each source you exclude, Tamr generates no record pairs within that source; it still generates pairs between that source and other sources.
To exclude pair generation within a dataset:
- Navigate to the Pairs page of a Mastering project.
- Click Manage pair generation.
- Select Open exclusions.
- Click + add source and choose a dataset to exclude.
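To get a feel for how much an exclusion can shrink the search space, the sketch below computes the maximum possible number of record pairs before any blocking clause is applied. It is plain arithmetic, not a description of how Estimate Counts works. The function name and the source names and sizes are hypothetical.

```python
from math import comb
from typing import Dict, Iterable


def max_pair_count(source_sizes: Dict[str, int],
                   excluded: Iterable[str] = ()) -> int:
    """Upper bound on record pairs, skipping pairs within excluded sources."""
    total_records = sum(source_sizes.values())
    all_pairs = comb(total_records, 2)
    # Pairs whose two records both come from the same excluded source.
    within_excluded = sum(comb(source_sizes[name], 2) for name in set(excluded))
    return all_pairs - within_excluded


sizes = {"crm": 1000, "erp": 500}  # hypothetical sources and record counts
print(max_pair_count(sizes))           # 1124250
print(max_pair_count(sizes, ["crm"]))  # 624750
```

In this example, excluding the larger source removes 499,500 within-source pairs while every cross-source pair remains eligible.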
Generating Record Pairs
Note: After making any changes to your blocking model, be sure to re-estimate record pair counts before you generate record pairs.
To generate record pairs:
- Navigate to the Pairs page of a Mastering project.
- Click Manage pair generation.
- Click Generate Pairs.