
Searching Records

Search records in the unified dataset using basic or advanced search syntax.

Basic Search

Search is case-insensitive.

Exact Phrases

To search for exact phrases, use double quotes.

"Machine Learning"

Attribute Names

To search within a specific attribute of a dataset, prefix the attribute name with tamr__.

tamr__vendor_name:Tamr

Golden Records

To search within a specific attribute of golden records, prefix the attribute name with gr__.

gr__<attribute_name>:"<value>"

To find empty or blank values in golden records, first select all records, and then negate the .raw keyword:

gr__<attribute_name>:* AND NOT gr__<attribute_name>.raw:*
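
When queries are assembled programmatically, the prefix and quoting rules above can be wrapped in small helpers. The following Python sketch is illustrative only: the function names field_query and golden_empty_query are hypothetical and not part of any Tamr API, and the backslash-escaping of embedded double quotes is an assumption.

```python
def field_query(attribute: str, value: str, golden: bool = False) -> str:
    """Return <prefix><attribute>:"<value>" as an exact-phrase search string."""
    prefix = "gr__" if golden else "tamr__"
    # Assumption: embedded backslashes and double quotes are backslash-escaped.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{prefix}{attribute}:"{escaped}"'


def golden_empty_query(attribute: str) -> str:
    """Return the golden-record query for empty or blank values of an attribute."""
    return f"gr__{attribute}:* AND NOT gr__{attribute}.raw:*"


print(field_query("vendor_name", "Tamr"))
print(golden_empty_query("vendor_name"))
```

The helpers only build query strings. Paste the result into the search box or pass it to whatever client you use to submit searches.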

Logical Operators

The logical operators AND, OR, and NOT must be capitalized.
Instead of AND, OR, and NOT, you can also use &&, ||, and !, respectively.

tamr__vendor_name:("Tamr" OR "Datatamr") AND NOT tamr__part_description:deterministic

The + operator requires that results contain the term that follows it.
The - operator requires that results do not contain the term that follows it.

For example:

tamr__description:(plate +sheet)

Note: If you omit the operator between terms, the search defaults to OR.

For example, the following queries are equivalent:

tamr__description:(plate sheet)

and

tamr__description:(plate OR sheet)
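
Because the implicit operator is OR, it can be safer to write the operator explicitly when generating queries from a list of terms. This is a minimal Python sketch; any_of and all_of are hypothetical helper names, and the terms are assumed to contain no reserved characters.

```python
from typing import Sequence


def any_of(attribute: str, terms: Sequence[str]) -> str:
    """Match records whose attribute contains at least one of the terms."""
    return f"tamr__{attribute}:({' OR '.join(terms)})"


def all_of(attribute: str, terms: Sequence[str]) -> str:
    """Match records whose attribute contains every one of the terms."""
    return f"tamr__{attribute}:({' AND '.join(terms)})"


print(any_of("description", ["plate", "sheet"]))
print(all_of("description", ["plate", "sheet"]))
```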

Range Filtering

To use a range filter in search, use the operators < (less than), <= (less than or equal to), > (greater than), and >= (greater than or equal to). Although you can also use = (equals), this is equivalent to a normal search.

Note: Do not add spaces between the attribute name, the operator, and the filter boundary.

To run a search that compares string type values, use syntax as in the following example. This search returns records where the string type attribute age is greater than or equal to 18.

tamr__age:>=18

You can combine range filtering in searches with other operators, for example:

tamr__age:(>=18 AND <=25)

String comparison is useful for filtering on dates. If dates use ISO format (YYYY-MM-DD), their string order matches their chronological order, so you can run range searches on string type values, as in the following example. This search finds records with dates in January 2020.

tamr__date:(>=2020-01-01 AND <2020-02-01)
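
The reason ISO dates work with string comparison can be checked outside the search engine. The Python sketch below uses made-up sample dates and Python's own string ordering as an analogy. It is not a description of how the search backend is implemented.

```python
dates = ["2019-12-31", "2020-01-01", "2020-01-15", "2020-02-01"]

# ISO dates order the same way as strings and as calendar dates,
# so a half-open string range selects exactly January 2020.
january_2020 = [d for d in dates if "2020-01-01" <= d < "2020-02-01"]
print(january_2020)

# A non-ISO format loses this property: compared as strings,
# 15 January 2020 sorts before 31 December 2019.
print("01/15/2020" < "12/31/2019")
```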

To run a search that compares numeric values, specify .numeric after the attribute name, as in the following example. Both integer and floating point type numbers are supported.

tamr__age.numeric:>=18
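
The difference between string and numeric comparison matters when values have different numbers of digits. The following Python sketch uses invented sample values and Python's lexicographic string ordering as an analogy. The exact ordering applied by the search backend is an assumption here.

```python
ages = ["9", "18", "25", "100"]

# Compared as strings, "9" sorts after "18" and "100" sorts before it.
as_strings = [a for a in ages if a >= "18"]

# Compared as numbers, the filter behaves as expected.
as_numbers = [a for a in ages if float(a) >= 18]

print(as_strings)
print(as_numbers)
```

If an attribute holds numbers of varying length, the .numeric form is the one that should give the intended result.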

Advanced Search

Fixed Length Expressions

To run a search with fixed length expressions, use one ? wildcard for each character. Each ? matches exactly one character.

tamr__part_id: BD???908

Variable Length Expressions

To run a search with variable length expressions, use the * wildcard for zero or more characters.

tamr__vendor_name: Tam*r
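
To preview which values a wildcard pattern would accept, the ? and * semantics can be approximated with Python's standard fnmatch module. This is only an approximation under stated assumptions: wildcard_match is a hypothetical helper, the sample values are invented, and the actual search may match against individual tokens and not the whole value.

```python
from fnmatch import fnmatchcase


def wildcard_match(value: str, pattern: str) -> bool:
    """Approximate ? (exactly one character) and * (zero or more characters)."""
    # Search is case-insensitive, so compare lowercased forms.
    return fnmatchcase(value.lower(), pattern.lower())


print(wildcard_match("BD123908", "BD???908"))  # three characters fill the three ? slots
print(wildcard_match("BD12908", "BD???908"))   # only two characters, so no match
print(wildcard_match("Tamr", "Tam*r"))         # * may match zero characters
print(wildcard_match("Tamer", "Tam*r"))
```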

Blank Values

To search for records with blank values, use the .raw keyword.

tamr__vendor_spend.raw:""

Null Values

To find null values, first select all records, and then negate the .raw keyword. That is, match a condition that is true for every record, and then use AND NOT to exclude records where the attribute in question has any value.

tamr__tamr_id:* AND NOT tamr__vendor_spend.raw:*

Regular Expressions

Regular expressions must be wrapped in forward slashes (/).

tamr__vendor_number: /0*100250/

Note: Use regular expressions to search for all records that contain numbers in a given field. For example, to return records that contain digits 0-9 in the vendor_number column, use:

tamr__vendor_number: /[0-9]/

Searching for Values with Punctuation

In regular searches for attribute values, value text is tokenized by Elasticsearch. Tokenization throws away any special characters.

To search for characters that are excluded by tokenization, search on the .raw facet of an attribute. For example, instead of tamr__text:"search string", search on tamr__text.raw:"search string".

When searching using .raw, ensure that your search matches the entire text of the attribute values being searched and use regular expressions when needed. For example:

tamr__myattribute.raw:/.*\|.*/
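
The requirement that a .raw search match the entire value is why the example pads the pipe character with .* on both sides. The Python sketch below imitates whole-value matching with re.fullmatch. The helper name raw_regex_match is hypothetical, and Python's regular expression dialect differs from the one used by the search engine, although the simple patterns shown here are assumed to behave the same way in both.

```python
import re


def raw_regex_match(value: str, pattern: str) -> bool:
    """Return True only if the pattern matches the whole value."""
    return re.fullmatch(pattern, value) is not None


print(raw_regex_match("a|b", r".*\|.*"))  # pattern covers the entire value
print(raw_regex_match("a|b", r"\|"))      # pattern covers only one character
print(raw_regex_match("ab", r".*\|.*"))   # no pipe character in the value
```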

Excluding Records

To run a successful search query that excludes items, first select records, and then use NOT to filter out some records from those results.

Note: Negative search queries require a wildcard * in the first part of the query to select all records, followed by a negative search in the second part that filters out records based on your criteria.

For example, to exclude a keyword from search, first use the wildcard * to select all records, and then use NOT to filter out the keyword.

tamr__vendor_name:(* NOT "Tamr")

To find records that do not have a level 2 categorization, first use the wildcard * to select all records that have a "Level1" category, and then exclude records that have a "Level2" category, as in the following example:

tamr__Level1:* AND NOT tamr__Level2:*

To exclude records with a specified category, select all records using the wildcard *, and then exclude the category, specifying the categoryId, as in the following example:

* AND NOT suggestedCategorization.categoryId: 852

For more information, see the Elasticsearch Reference.

Columns With Spaces

To search for an attribute that has spaces in the name, escape each space with a backslash (\).

tamr__Column\ With\ Spaces:value

Escaping Reserved Characters

The reserved characters are: + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /.

To run searches for items that contain reserved characters, escape them with a backslash \.

tamr__vendor_formula:\(1\+1\)\=2
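
Escaping by hand is error-prone when values come from user input, so the rules above can be applied mechanically. The following Python sketch is one possible approach, not a Tamr utility: the function names are hypothetical, and escaping each & and | individually (instead of treating && and || as pairs) is an assumption.

```python
# Single characters taken from the reserved list above.
# Assumption: && and || are handled by escaping each & and | on its own.
RESERVED = set('+-=&|><!(){}[]^"~*?:\\/')


def escape_reserved(text: str) -> str:
    """Prefix every reserved character with a backslash."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in text)


def escape_attribute_name(name: str) -> str:
    """Escape reserved characters and spaces in an attribute name."""
    return escape_reserved(name).replace(" ", "\\ ")


print("tamr__vendor_formula:" + escape_reserved("(1+1)=2"))
print("tamr__" + escape_attribute_name("Column With Spaces") + ":value")
```

Processing the text one character at a time avoids escaping a backslash that the function itself just inserted.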

Advanced Metadata Search for Mastering Projects

If you add your own labels to records, Tamr Core attaches metadata to these records. Mastering projects have their own lists of metadata that become associated with records or pairs.

To search for records or clusters on the Clusters tab of a Mastering project, you can use the following metadata.

  • Cluster Id. Use this variable to find records in a newly formed cluster (that is, a cluster that does not yet have a persistent cluster Id). For example, cluster.id.raw:"<cluster-id>".
  • Published Cluster Id. Use this variable to find records associated with a persistent cluster Id. For example, publishedCluster.id.raw:"<cluster-id>".
  • Suggested Cluster Id. Use this variable to find records that have a suggested association with a given cluster. For example, suggestedClusterId:"<cluster-id>".
  • Cluster Name. Use this variable to find a cluster by name. For example, cluster.name.raw:"<name>".
  • Verification Type. Use this variable to find records that have a specified type of verification. For example, verificationType:"SUGGEST".

To search for clusters that contain more than 40 records, use the following syntax:

cluster.recordCount:(>40)