HomeTamr Core GuidesTamr Core API Reference
Tamr Core GuidesTamr Core API ReferenceTamr Core TutorialsEnrichment API ReferenceSupport Help CenterLog In

Case

You use case expressions to create a rule-based formula, where different cases have different outputs.

Overview

case expressions are similar to "if-else statements" in programming languages.

  • case expressions can include any number of WHEN clauses and can optionally include an ELSE clause.
  • If you omit the ELSE clause, cases where none of the WHEN conditions are met are filled with the value null.
  • If the data satisfies multiple WHEN conditions, the output of the first WHEN statement is used.

You can write case expressions in two styles, simplified and flexible.

Simplified Case Expressions

You can use a simplified case expression in which a single attribute is directly compared with values present in the WHEN clauses. In this example, the attribute company_code is populated based on the contents of the attribute company_type.

A formula starting CASE company_type that then uses three WHEN clauses to identify different values and populate a different attribute with abbreviations of those values.A formula starting CASE company_type that then uses three WHEN clauses to identify different values and populate a different attribute with abbreviations of those values.

A formula starting CASE company_type that then uses three WHEN clauses to identify different values and populate a different attribute with abbreviations of those values.

If company_type is equal to the string 'Private', then company_code is populated with the string PRV. Similarly, a value of 'Public' in company_type results in a value of 'PUB' for company_code, and a value of 'Partnership' results in a value of 'PRT'. In this example the optional ELSE clause was not included, so company_code will be null when company_type does not match any of the options listed.

A preview of the resulting dataset, with a company_type attribute and the populated company_code attribute, which has null for any value not included in the WHEN clauses.A preview of the resulting dataset, with a company_type attribute and the populated company_code attribute, which has null for any value not included in the WHEN clauses.

A preview of the resulting dataset, with a company_type attribute and the populated company_code attribute, which has null for any value not included in the WHEN clauses.

Flexible Case Expressions

The second style of case expression is more flexible, allowing the use of multiple attributes in the WHEN clauses and more complex conditional statements, such as the array.contains() function. The following example creates a region column based on whether the value of the state column is within an array of states. It also ensures that the country column has the value 'USA' before applying these regional rules.

A CASE statement followed by a series of WHEN clauses, beginning with WHEN country == 'USA' AND array.contains(array('AL', 'AR', 'FL', 'LA'), state) THEN 'South' and similar WHEN clauses to identify other regions using values for state, followed by ELSE 'UNKNOWN'.A CASE statement followed by a series of WHEN clauses, beginning with WHEN country == 'USA' AND array.contains(array('AL', 'AR', 'FL', 'LA'), state) THEN 'South' and similar WHEN clauses to identify other regions using values for state, followed by ELSE 'UNKNOWN'.

A CASE statement followed by a series of WHEN clauses, beginning with WHEN country == 'USA' AND array.contains(array('AL', 'AR', 'FL', 'LA'), state) THEN 'South' and similar WHEN clauses to identify other regions using values for state, followed by ELSE 'UNKNOWN'.

A preview of the result shows the region attribute populated with the values specified by each WHEN clause.

A dataset with region values such as Northeast for a record with a state value of NJ and UNKNOWN for a state value of PA.A dataset with region values such as Northeast for a record with a state value of NJ and UNKNOWN for a state value of PA.

A dataset with region values such as Northeast for a record with a state value of NJ and UNKNOWN for a state value of PA.

This more flexible style for writing case expressions often relies on using logical comparators to find values that do, and do not, meet certain criteria.

Nested Case Expressions

In some situations, it may be beneficial to use a case expression within another case expression. The following example of nested case expressions produces a result similar to the previous, unnested, example. There is one small difference in the result. This example distinguishes between records that are not in the USA, which are assigned the value 'NON-USA', and records that are within the USA, but do not fall within one of the defined regions, which are assigned the value 'UNKNOWN USA'.

A CASE statement followed by a WHEN clauses of WHEN country == 'USA' THEN CASE followed by a series of nested WHEN clauses including WHEN array.contains(array('MA', 'NH', 'NJ'), state) THEN 'Northeast', followed by ELSE 'UNKNOWN USA' END ELSE 'NON-USA'.A CASE statement followed by a WHEN clauses of WHEN country == 'USA' THEN CASE followed by a series of nested WHEN clauses including WHEN array.contains(array('MA', 'NH', 'NJ'), state) THEN 'Northeast', followed by ELSE 'UNKNOWN USA' END ELSE 'NON-USA'.

A CASE statement followed by a WHEN clauses of WHEN country == 'USA' THEN CASE followed by a series of nested WHEN clauses including WHEN array.contains(array('MA', 'NH', 'NJ'), state) THEN 'Northeast', followed by ELSE 'UNKNOWN USA' END ELSE 'NON-USA'.

A preview of the result shows the region attribute populated with the values specified by each WHEN clause, records with a country of CANADA populated with NON-USA, and CT and PA as UNKNOWN USA.

A dataset with region values of Northeast, Midwest, NON-USA, and UNKNOWN USA.A dataset with region values of Northeast, Midwest, NON-USA, and UNKNOWN USA.

A dataset with region values of Northeast, Midwest, NON-USA, and UNKNOWN USA.

Case Expressions in Scripts

You can also include case expressions in scripts. To do this, use SELECT * before the beginning of your statement and include AS <target_attribute> after the END.


Did this page help you?