Searching Records
Search records in the unified dataset using basic or advanced search syntax.
Basic Search
Search is case-insensitive.
Phrases
Exact phrases require double quotes.
"Machine Learning"
Attribute names
Attribute names require the prefix tamr__.
tamr__vendor_name:Tamr
Logical Operators
The logical operators AND, OR, and NOT are capitalized.
tamr__vendor_name:("Tamr" OR "Datatamr") AND NOT tamr__part_description:deterministic
In addition, AND, OR, and NOT are interchangeable with &&, ||, and !, respectively.
The operator + requires that results contain a term; the operator - requires that results do not contain it.
tamr__description:(plate +sheet)
Note that omitting an operator defaults to OR.
tamr__description:(plate sheet)
and
tamr__description:(plate OR sheet)
are equivalent.
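If you assemble search strings in a script, spelling out the operator avoids relying on the OR default. The following is a minimal sketch in Python; the helper name field_group is hypothetical and not part of Unify, and it only builds the query text described above.

```python
def field_group(attribute: str, terms: list[str], operator: str = "OR") -> str:
    """Build a grouped search on one attribute with an explicit operator.

    Hypothetical helper: it only formats a string, it does not run a search.
    """
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be the capitalized AND or OR")
    # Exact phrases (terms containing a space) need double quotes.
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    joined = f" {operator} ".join(quoted)
    # Attribute names require the tamr__ prefix.
    return f"tamr__{attribute}:({joined})"


print(field_group("description", ["plate", "sheet"]))
# tamr__description:(plate OR sheet)
print(field_group("vendor_name", ["Tamr", "Machine Learning"], "AND"))
# tamr__vendor_name:(Tamr AND "Machine Learning")
```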
Advanced Search
Fixed Length Expressions
Unify supports fixed-length expressions using the single-character wildcard ?, which matches exactly one character.
tamr__part_id: BD???908
Variable Length Expressions
Unify supports variable-length expressions using the zero-or-more wildcard *.
tamr__vendor_name: Tam*r
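To check what a wildcard pattern would accept before running a search, you can reproduce the two wildcards locally. The sketch below uses Python's standard fnmatch module, which gives ? and * the same meaning (exactly one character, and zero or more characters). It is an illustration only, not Unify's search engine, and the helper name wildcard_match is hypothetical. Note that fnmatch also treats [ ] specially, so keep test patterns to ? and *.

```python
from fnmatch import fnmatchcase


def wildcard_match(value: str, pattern: str) -> bool:
    """Return True if value matches pattern, where ? is one character
    and * is zero or more characters."""
    # Lowercase both sides to mirror case-insensitive search.
    return fnmatchcase(value.lower(), pattern.lower())


print(wildcard_match("BD123908", "BD???908"))  # True: three characters fill ???
print(wildcard_match("BD12908", "BD???908"))   # False: only two characters
print(wildcard_match("Tamr", "Tam*r"))         # True: * matches zero characters
print(wildcard_match("Tamer", "Tam*r"))        # True: * matches "e"
```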
Empty or Blank values
To find empty or blank values, first select all records, and then negate a wildcard match on the column using the keyword raw.
That is, search for a condition that is always true AND NOT the column in question having any value.
tamr__tamr_id:* AND NOT tamr__vendor_spend.raw:*
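The same pattern applies to any column. A minimal sketch, assuming tamr_id is populated for every record as in the example above; the helper name blank_values_query is hypothetical and only formats the search string.

```python
def blank_values_query(column: str, always_present: str = "tamr_id") -> str:
    """Build a search for records whose column is empty or blank.

    always_present is a column assumed to have a value on every record,
    so that the first clause selects all records.
    """
    select_all = f"tamr__{always_present}:*"
    has_value = f"tamr__{column}.raw:*"
    return f"{select_all} AND NOT {has_value}"


print(blank_values_query("vendor_spend"))
# tamr__tamr_id:* AND NOT tamr__vendor_spend.raw:*
```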
Regular Expressions
Regular expressions must be wrapped in forward slashes /.
tamr__vendor_number: /0*100250/
Using regular expressions to search for numeric values
You can use regular expressions to search for all records that contain numbers in a given field. For example, to return records that contain the digits 0-9 in the vendor_number column:
tamr__vendor_number: /[0-9]/
Excluding Records
To exclude a keyword, use the * operator to select all records, and NOT to filter out the keyword.
tamr__vendor_name:(* NOT "Tamr")
To find records that do not have a level 2 categorization, search for all records that are categorized at Level 1, but not at Level 2.
tamr__Level1:* AND NOT tamr__Level2:*
Additional Information
See here for more information on wildcard and regular expression searching.
Columns With Spaces
To search a Unify field that has spaces in its name, escape each space with a backslash \.
tamr__Column\ With\ Spaces:value
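When field names come from a spreadsheet header or another source with spaces, the escaping can be applied in code. A minimal sketch; the helper name field_query is hypothetical and handles spaces only, not the other reserved characters.

```python
def field_query(column: str, value: str) -> str:
    """Build a field search, escaping each space in the column name."""
    # A single backslash goes before every space in the name.
    escaped = column.replace(" ", "\\ ")
    return f"tamr__{escaped}:{value}"


print(field_query("Column With Spaces", "value"))
# tamr__Column\ With\ Spaces:value
```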
Reserved Characters
The reserved characters are: + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /
To find reserved characters, escape them with a backslash \.
tamr__vendor_formula:\(1\+1\)\=2
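Escaping by hand is error-prone for longer values. The sketch below escapes every reserved character in a search term. It assumes that the two-character operators && and || can be escaped one character at a time; the helper name escape_reserved is hypothetical.

```python
# Every character that appears in the reserved list above.
# && and || are handled by escaping each & and | individually (an assumption).
RESERVED = set('+-=&|><!(){}[]^"~*?:\\/')


def escape_reserved(term: str) -> str:
    """Prefix each reserved character in term with a backslash."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in term)


print("tamr__vendor_formula:" + escape_reserved("(1+1)=2"))
# tamr__vendor_formula:\(1\+1\)\=2
```

Apply the helper to the search value only. Escaping a complete query would also escape the colon after the attribute name and any wildcards you intend to use.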
Advanced Metadata Search
If you categorize and add your own labels to records, Tamr attaches metadata to these records. Mastering and Categorization projects have their own lists of metadata that become associated with records or record pairs.
Categorization
Tamr adds the following metadata to categorized records:
Metadata | Suggested or Manual | Data type | Description |
---|---|---|---|
categoryId | Both | long | The unique ID of the category that will be used in searches. |
reason | Manual | String | The description that users add when categorizing records. You can search these descriptions by keyword, exact phrase, or a REGEX match. |
score | Suggested | Long | The confidence score that Tamr associates with the suggested categorization. The score range is [0,1]. |
timestamp | Both | Int | The time it took to categorize the record. |
username | Manual | String | The username of the user who created the manual categorization. |
To search any of these metadata:
- Use the form manualCategorization.[metadata]: <searchTerm>, where the data type of the search term is one of the data types listed in the previous table. The capitalization style for specifying the data type matters.
You can choose to replace manualCategorization with suggestedCategorization.
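The table above also determines which metadata can be combined with which prefix. A minimal sketch that builds the search string and rejects combinations the table does not list; the helper name categorization_query and the sample username jsmith are hypothetical.

```python
# Which categorization kinds carry each metadata name, copied from the table above.
AVAILABILITY = {
    "categoryId": {"manual", "suggested"},
    "reason": {"manual"},
    "score": {"suggested"},
    "timestamp": {"manual", "suggested"},
    "username": {"manual"},
}


def categorization_query(metadata: str, search_term: str, kind: str = "manual") -> str:
    """Build a metadata search such as manualCategorization.username: jsmith."""
    if metadata not in AVAILABILITY:
        raise ValueError(f"unknown metadata name: {metadata}")
    if kind not in AVAILABILITY[metadata]:
        raise ValueError(f"{metadata} is not available for {kind} categorizations")
    # kind is "manual" or "suggested", giving the documented prefixes.
    return f"{kind}Categorization.{metadata}: {search_term}"


print(categorization_query("username", "jsmith"))
# manualCategorization.username: jsmith
print(categorization_query("categoryId", "42", kind="suggested"))
# suggestedCategorization.categoryId: 42
```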
Mastering
The following metadata variables are available to search on the Clusters tab of a Mastering project. For more information, see Searching Cluster Records.
Metadata Variable | Example Search | Description |
---|---|---|
Cluster Id | cluster.id.raw:"<cluster-id>" | Finds records in a current cluster. |
Published Cluster Id | publishedCluster.id.raw:"<cluster-id>" | Finds records in a cluster as of the last time it was published. |