User GuidesAPI ReferenceRelease Notes
Doc HomeHelp CenterLog In
User Guides

Explode

EXPLODE creates one row for every element in an array.

You can EXPLODE an attribute if you want a separate row for each value in an array.

The template for EXPLODE is
EXPLODE attribute; or EXPLODE expression AS attribute;

You can only EXPLODE attributes or expressions which are arrays. If an attribute or expression contains nested arrays, you can EXPLODE the attribute multiple times. Each statement, however, can only include one EXPLODE at a time.

important Important: Changing an EXPLODE statement can result in an unintentional change to tamr_id values in the unified dataset. To avoid data loss and maintain the stability of tamr_id in mastering and categorization projects, as a best practice Tamr recommends that you explicitly set the tamr_id to a value you're sure is stable in the last transformation. See Managing Primary Keys.

Basic Example

Your data looks like this, with an array of strings in the Name attribute.

StateName
MAJohn, Smith
NYCarol, Baldwin

When you write the statement EXPLODE Name, each element of the Name array is separated into its own row:

StateName
MAJohn
MASmith
NYCarol
NYBaldwin

INNER and OUTER

There are two forms of EXPLODE:

  • EXPLODE INNER drops the corresponding row when the attribute or expression is an empty array or null.
  • EXPLODE OUTER preserves empty arrays and nulls, generating a null in the exploded row instead.

If you do not specify INNER or OUTER, EXPLODE is INNER by default.

Examples of these two forms follow, using the same initial data containing different types of empty and null values. A description is provided for each value so you can see how they are treated differently.

StateNameDescription
MAJohn, SmithNo nulls or empty values
NYCarolA single value in an array
CAAlex, null The second value in the array is null
MI,Two empty strings in an array
COnullNo array, the value in this cell is null
TXnull , nullAn array of nulls
KYAn empty array

If you EXPLODE INNER Name, the result is:

StateNameDescription
MAJohnNo nulls or empty values
MASmithNo nulls or empty values
NYCarolA single value in an array
CAAlexThe second value in the array is null
CAnullThe second value in the array is null
MITwo empty strings in an array
MITwo empty strings in an array
TXnullAn array of nulls
TXnullAn array of nulls

Notice how the rows for CO and KY disappeared: INNER drops a row if it is an empty array or null. Any nulls or empty values in an array, however, receive their own row. You can see this behavior for CA, MI, and TX: these values were each given their own row.

Alternatively, if you write EXPLODE OUTER Name, the result is:

StateNameDescription
MAJohnNo nulls or empty values
MASmithNo nulls or empty values
NYCarolA single value in an array
CAAlexThe second value in the array is null
CAnullThe second value in the array is null
MITwo empty strings in an array
MITwo empty strings in an array
COnullNo array, the value in this cell is null
TXnullAn array of nulls
TXnullAn array of nulls
KYnullAn empty array

Here you can see the null value for CO is generated by a new null, and a new null is also generated for the empty array in KY.