HomeTamr Core GuidesTamr Core API Reference
Tamr Core GuidesTamr Core API ReferenceTamr Core TutorialsEnrichment API ReferenceSupport Help CenterLog In

Explode

EXPLODE creates one row for every element in an array.

You can EXPLODE an attribute if you want a separate row for each value in an array.

The template for EXPLODE is
EXPLODE attribute; or EXPLODE expression AS attribute;

You can only EXPLODE attributes or expressions which are arrays. If an attribute or expression contains nested arrays, you can EXPLODE the attribute multiple times. Each statement, however, can only include one EXPLODE at a time.

importantimportant Important: Changing an EXPLODE statement can result in an unintentional change to tamr_id values in the unified dataset. To avoid data loss and maintain the stability of tamr_id in mastering and categorization projects, as a best practice Tamr recommends that you explicitly set the tamr_id to a value you're sure is stable in the last transformation. See Managing Primary Keys.

Basic Example

Your data looks like this, with an array of strings in the Name attribute.

State

Name

MA

John, Smith

NY

Carol, Baldwin

When you write the statement EXPLODE Name, each element of the Name array is separated into its own row:

State

Name

MA

John

MA

Smith

NY

Carol

NY

Baldwin

INNER and OUTER

There are two forms of EXPLODE:

  • EXPLODE INNER drops the corresponding row when the attribute or expression is an empty array or null.
  • EXPLODE OUTER preserves empty arrays and nulls, generating a null in the exploded row instead.

If you do not specify INNER or OUTER, EXPLODE is INNER by default.

Examples of these two forms follow, using the same initial data containing different types of empty and null values. A description is provided for each value so you can see how they are treated differently.

State

Name

Description

MA

John, Smith

No nulls or empty values

NY

Carol

A single value in an array

CA

Alex, null

The second value in the array is null

MI

,

Two empty strings in an array

CO

null

No array, the value in this cell is null

TX

null , null

An array of nulls

KY

An empty array

If you EXPLODE INNER Name, the result is:

State

Name

Description

MA

John

No nulls or empty values

MA

Smith

No nulls or empty values

NY

Carol

A single value in an array

CA

Alex

The second value in the array is null

CA

null

The second value in the array is null

MI

Two empty strings in an array

MI

Two empty strings in an array

TX

null

An array of nulls

TX

null

An array of nulls

Notice how the rows for CO and KY disappeared: INNER drops a row if it is an empty array or null. Any nulls or empty values in an array, however, receive their own row. You can see this behavior for CA, MI, and TX: these values were each given their own row.

Alternatively, if you write EXPLODE OUTER Name, the result is:

State

Name

Description

MA

John

No nulls or empty values

MA

Smith

No nulls or empty values

NY

Carol

A single value in an array

CA

Alex

The second value in the array is null

CA

null

The second value in the array is null

MI

Two empty strings in an array

MI

Two empty strings in an array

CO

null

No array, the value in this cell is null

TX

null

An array of nulls

TX

null

An array of nulls

KY

null

An empty array

Here you can see the null value for CO is generated by a new null, and a new null is also generated for the empty array in KY.


Did this page help you?