Searching Records

Search records in the unified dataset using basic or advanced search syntax.

Basic Search

Search is case-insensitive.

Phrases

Exact phrases require double quotes.

"Machine Learning"

Attribute names

Attribute names require the prefix tamr__.

tamr__vendor_name:Tamr

Logical Operators

The logical operators AND, OR, and NOT must be capitalized.

tamr__vendor_name:("Tamr" OR "Datatamr") AND NOT tamr__part_description:deterministic

In addition, AND, OR and NOT are interchangeable with &&, ||, and !, respectively.

The + operator requires that results contain a term; the - operator requires that results do not contain it.

tamr__description:(plate +sheet)

Note that omitting an operator defaults to OR.

tamr__description:(plate sheet)

and

tamr__description:(plate OR sheet)

are equivalent.
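The basic syntax above can also be assembled programmatically. The following is a minimal sketch in Python; the helper names `field_term` and `any_of` are hypothetical and not part of any Tamr API, and the sketch assumes the values contain no reserved characters that would need escaping.

```python
def field_term(attribute: str, value: str, phrase: bool = False) -> str:
    """Build `tamr__<attribute>:<value>`, quoting the value for an exact phrase."""
    rendered = f'"{value}"' if phrase else value
    return f"tamr__{attribute}:{rendered}"


def any_of(attribute: str, values: list[str]) -> str:
    """Build `tamr__<attribute>:("a" OR "b" ...)` from a list of exact phrases."""
    joined = " OR ".join(f'"{v}"' for v in values)
    return f"tamr__{attribute}:({joined})"


# Hypothetical helpers reproducing the logical-operator example above.
query = (
    any_of("vendor_name", ["Tamr", "Datatamr"])
    + " AND NOT "
    + field_term("part_description", "deterministic")
)
print(query)
```

The resulting string is the same query shown in the Logical Operators example and can be pasted into the search box.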

Advanced Search

Fixed Length Expressions

Unify supports fixed-length variations using the single-character wildcard ?, which matches exactly one character.

tamr__part_id: BD???908

Variable Length Expressions

Unify supports variable-length expressions using the wildcard *, which matches zero or more characters.

tamr__vendor_name: Tam*r
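To build intuition for which values the ? and * wildcards accept, the patterns can be approximated locally with Python's `fnmatch` module. This is only an illustration of the wildcard semantics described above, not how Unify evaluates searches; the function name `wildcard_match` is hypothetical, and `fnmatch` additionally treats `[` as special, which this sketch ignores.

```python
from fnmatch import fnmatchcase


def wildcard_match(pattern: str, value: str) -> bool:
    """Approximate ? (exactly one character) and * (zero or more characters).

    Both sides are lowercased because search is described as case-insensitive.
    """
    return fnmatchcase(value.lower(), pattern.lower())


# Fixed-length: exactly three characters between "BD" and "908".
print(wildcard_match("BD???908", "BD123908"))  # True
print(wildcard_match("BD???908", "BD12908"))   # False, only two characters

# Variable-length: zero or more characters between "Tam" and "r".
print(wildcard_match("Tam*r", "Tamr"))         # True
print(wildcard_match("Tam*r", "Tamer"))        # True
```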

Empty or Blank values

To find empty or blank values, first select all records, and then negate a wildcard match on the attribute's raw keyword.
(That is, search for a condition that is true for every record AND NOT the column in question having any value.)

tamr__tamr_id:* AND NOT tamr__vendor_spend.raw:*
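If you need this query for several columns, it can be generated from the attribute name. The sketch below is a minimal Python example; `blank_values_query` is a hypothetical helper, and it assumes `tamr_id` is populated for every record, as in the example above.

```python
def blank_values_query(attribute: str, id_attribute: str = "tamr_id") -> str:
    """Select all records, then exclude those with any value in `attribute`."""
    return f"tamr__{id_attribute}:* AND NOT tamr__{attribute}.raw:*"


print(blank_values_query("vendor_spend"))
```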

Regular Expressions

Regular expressions must be wrapped in forward slashes (/).

tamr__vendor_number: /0*100250/

📘

Using regular expressions to search for numeric values

You can use regular expressions to search for all records that contain numbers in a given field. For example, to return records that contain the digits 0-9 in the vendor_number column:

tamr__vendor_number: /[0-9]/

Excluding Records

To exclude a keyword, use the * wildcard to select all records, and NOT to filter out the keyword.

tamr__vendor_name:(* NOT "Tamr")

To find records that do not have a level 2 categorization, search for all records that are categorized at Level 1, but not at Level 2.

tamr__Level1:* AND NOT tamr__Level2:*

📘

Additional Information

See here for more information on wildcard and regular expression searching.

Columns With Spaces

To search a Unify field that has spaces in its name, escape each space with a backslash (\).

tamr__Column\ With\ Spaces:value
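The escaping rule for column names can be applied mechanically. A minimal Python sketch follows; `field_name` is a hypothetical helper, and it only handles spaces, not other reserved characters.

```python
def field_name(column: str) -> str:
    """Prefix a column name with tamr__ and escape each space with a backslash."""
    return "tamr__" + column.replace(" ", "\\ ")


print(field_name("Column With Spaces") + ":value")
```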

Reserved Characters

The reserved characters are: + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /.

To search for a reserved character literally, escape it with a backslash (\).

tamr__vendor_formula:\(1\+1\)\=2
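When search values come from user input or another system, escaping by hand is error-prone. The following is a minimal Python sketch that prefixes each reserved character listed above with a backslash. The function name `escape_reserved` is hypothetical, and escaping each & and | individually (rather than only the && and || pairs) is an assumption made for simplicity.

```python
import re

# Single-character reserved symbols from the list above. The two-character
# operators && and || are covered by escaping each & and | individually.
_RESERVED = re.compile(r'([+\-=&|><!(){}\[\]^"~*?:\\/])')


def escape_reserved(value: str) -> str:
    """Prefix every reserved character in `value` with a backslash."""
    return _RESERVED.sub(r"\\\1", value)


print("tamr__vendor_formula:" + escape_reserved("(1+1)=2"))
```

Apply the helper to search values only, not to a whole query, because it would also escape the operators and wildcards you intend to use.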

Advanced Metadata Search

If you categorize and add your own labels to records, Tamr attaches metadata to these records. Mastering and Categorization projects have their own lists of metadata that become associated with records or record pairs.

Categorization

Tamr adds the following metadata to categorized records:

| Metadata | Suggested or Manual | Data type | Description |
| --- | --- | --- | --- |
| categoryId | Both | long | The unique ID of the category that will be used in searches. |
| reason | Manual | String | The description that users add when categorizing records. You can search these descriptions by keyword, exact phrase, or a REGEX match. |
| score | Suggested | Long | The confidence score that Tamr associates with the suggested categorization. The score range is [0,1]. |
| timestamp | Both | Int | The time it took to categorize the record. |
| username | Manual | String | The username of the user who created the manual categorization. |

To search any of these metadata:

  1. Use the form manualCategorization.[metadata]: <searchTerm>, where datatype is one of the data types listed in the previous table. The capitalization of the data type matters.

You can choose to replace manualCategorization with suggestedCategorization.

Mastering

The following metadata variables are available to search on the Clusters tab of a Mastering project. For more information, see Searching Cluster Records.

| Metadata Variable | Example Search | Description |
| --- | --- | --- |
| Cluster Id | cluster.id.raw:"<cluster-id>" | To find records in a current cluster. |
| Published Cluster Id | publishedCluster.id.raw:"<cluster-id>" | To find records in a cluster last time it was published. |
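Both cluster searches follow the same pattern and differ only in the field name. A minimal Python sketch, assuming the cluster ID contains no double quotes; `cluster_query` is a hypothetical helper and `abc-123` is a placeholder ID used only for illustration.

```python
def cluster_query(cluster_id: str, published: bool = False) -> str:
    """Build a search for records in a current or last-published cluster."""
    field = "publishedCluster.id.raw" if published else "cluster.id.raw"
    return f'{field}:"{cluster_id}"'


print(cluster_query("abc-123"))                  # current cluster
print(cluster_query("abc-123", published=True))  # last published cluster
```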