
Statement Modifiers

Apply labels and hints to statements in your transformation scripts. Use scope syntax to apply labels and hints to multiple statements at a time.

Label

To name the dataset output of any transformation statement and save its state, apply a label. You can later reference this label anywhere a dataset can be referenced, such as in JOIN, UNION, or USE statements.

Label examples:

my_label: SELECT *;
A_different_label: FILTER company_name IS NOT NULL;

Here is how you can reference a labeled dataset:
USE my_label;
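
A label can also stand in for a dataset name in a JOIN. The following is a minimal sketch, not confirmed syntax: it assumes a label is written unquoted in JOIN WITH, just as it is in USE. The label non_null_companies and the alias nn are hypothetical names chosen for illustration, and the id column is borrowed from the join example in the Hint section.

```
// Label the filtered output so it can be referenced later.
non_null_companies: FILTER company_name IS NOT NULL;

// Reference the label where a quoted dataset name would otherwise appear (assumed syntax).
JOIN WITH non_null_companies AS nn ON id == nn.id;
```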

Hint

Hints are an advanced feature that allows you to modify how Tamr runs transformations. Add the HINT statement directly before the transformation that you would like to modify.

Value for HINT: join.broadcast
Description: This value for the HINT statement changes the way Spark completes a join operation. A broadcast join is more efficient than the default join when the joined dataset is much smaller than the unified dataset. Note: Use this HINT value with caution and consult with Tamr Support.
Example: hint(join.broadcast) JOIN WITH "my_small_data.csv" AS sm on id == sm.id;

Value for HINT: hint(pkmanagement.manual)
Description: This value for the HINT statement disables the automatic management (assignment) of the primary key (tamr_id) for records in unified datasets. If you use this HINT value, there is no guarantee that tamr_id is a unique string for every record of the unified dataset. You will need to ensure the uniqueness of your primary key manually. See Primary Key Management.
Example: hint(pkmanagement.manual) MERGE BY city;

Value for HINT: hint(pkmanagement.auto)
Description: This value for the HINT statement enables the automatic management of the primary key of unified datasets (tamr_id). See Primary Key Management.
Example: hint(pkmanagement.auto) MERGE BY city;

Value for HINT: hint(pkmanagement.default)
Description: This value for the HINT statement sets the automatic management of the primary key to the default value. Since the current default is having primary key management enabled, this is the same as hint(pkmanagement.auto). Since this is the default behavior for managing primary keys, this HINT statement can be omitted and primary key management will have the same default behavior.
Example: hint(pkmanagement.default) MERGE BY city;

Scope

Scoping allows you to apply a HINT or a LABEL to a group of transformations rather than to a single statement. A scope begins with a { and ends with a }.

For example, use scope to label the output of three statements:

my_label: {
  SELECT *, str_join(', ', array(address_line1, city, state, zip)) AS full_address;
  FILTER full_address IS NOT EMPTY;
  MERGE BY full_address;
};

Use scope to apply a HINT to two joins:

hint(join.broadcast) {
  JOIN WITH "state_lkp.csv" AS st ON state == st.name;
  JOIN WITH "price_lkp.csv" AS pr ON part_name == pr.product;
};
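
Because a scope can carry a label, the output of a labeled scope can be referenced later like any other labeled dataset. The following is a minimal sketch that combines only the scope, label, and USE syntax already shown on this page. The label name addresses is hypothetical, and the column names are taken from the scope example above.

```
// Label the combined output of two statements.
addresses: {
  SELECT *, str_join(', ', array(address_line1, city, state, zip)) AS full_address;
  FILTER full_address IS NOT EMPTY;
};

// Later in the script, reference the labeled output by name.
USE addresses;
```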
