Formulas

Use formulas to automatically add a single expression into a well-formed SELECT statement.

Formulas operate on values within one column and can access values in other columns within the same row, offering a more structured option for adding simple transformations. However, formulas and scripts use the same building blocks (expressions and functions), so you can use formulas to express many of the transformation scripts that begin with SELECT *. Common use cases for formulas are concatenation, conditional logic, arithmetic operations, and replication of part or all of another column.

The following tips cover functions that you may find useful when creating formulas.

By default, every column is typed as an array of strings, even if it contains only one value. Tamr therefore provides additional array functions to handle transformations correctly. The following is a list of useful array functions:

  • array converts a string into an array. This allows you to keep columns consistently typed.
  • len is useful when writing CASE statements because it allows you to handle arrays of different lengths. It differs from the length function, which returns the length of a string.
  • to_string allows you to access a particular index of a multi-value field. For example, to_string(<attributeName>) is the preferred way to access strings from a single-value field.
  • array.concat allows you to concatenate arrays. Array concatenation adds values to an array in such a way that array.concat(["A", "B"]) and array.concat(["AB"]) both end up as "AB".
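
To make the array-versus-string distinction concrete, here is a minimal sketch in Python rather than in Tamr's expression language. The helper names as_array, array_len, and string_length are hypothetical and exist only for this illustration. They model the behavior described above for array, len, and length, and are not Tamr's implementation.

```python
from typing import List, Union


def as_array(value: Union[str, List[str]]) -> List[str]:
    """Hypothetical model of `array`: wrap a bare string in a one-element array."""
    if isinstance(value, str):
        return [value]
    return list(value)


def array_len(values: List[str]) -> int:
    """Hypothetical model of `len`: the number of elements in an array."""
    return len(values)


def string_length(value: str) -> int:
    """Hypothetical model of `length`: the number of characters in a string."""
    return len(value)


# An illustrative row: one multi-value column and one bare string.
row = {"name": ["Acme", "Acme Corp"], "city": "Boston"}

city = as_array(row["city"])            # ["Boston"]: now typed like every other column
name_count = array_len(row["name"])     # 2: two values in the array
first_name_chars = string_length(row["name"][0])  # 4: characters in "Acme"
```

The point of the sketch is that counting the elements of an array and counting the characters of a string are different operations, which is why the two functions are kept separate.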

The output attribute is the attribute to which the formula's result is written. Note that this overwrites any values already in that attribute. Attribute names are case sensitive, but function names and keywords, such as case, when, and empty, are not.
