Tamr Documentation

Low-Latency Matching External Records

Given a small number of incoming or external records (up to a thousand), identify matches between the incoming records and existing records, or between the incoming records and clusters.

Low-latency match (LLM) is an API-only Tamr feature that provides fast match suggestions for a small batch of records. Send a small number of records (fewer than 10,000) to Tamr’s match API, and Tamr quickly returns the most likely match(es) for each record, without the need to rerun pair generation and clustering jobs. Tamr uses minimal computational resources during this process; even if the number of records stored in Tamr is large (for example, tens of millions), Tamr is able to return an answer within seconds.

Note: The low-latency matching service is available on port 9170, not the default port (9100). (See the Configuration Variable Reference for information about changing the match service port.) The endpoints for Query LLM Status, Query Last LLM Update, and Perform LLM Match are different from the endpoint for Update LLM Data. Contact Tamr support for configuration assistance.

Prerequisites

  • At least one mastering project exists (Creating a Project).
  • At least one dataset is added to the project and is schema mapped to the project's unified dataset (Adding a Dataset).
  • The Generate Record Pairs and Update Results jobs have both been executed OR a mastering model has been imported (Importing a Mastering Model).
  • Input records include non-null data for the attributes used in the blocking model and are pre-processed in the same way as the data processed by transformations in the mastering project.
  • The port for the low-latency matching service (default 9170) is open to allow inbound access for your application.
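
The last prerequisite can be checked from the machine that runs your application before any match request is sent. The following is a minimal sketch, not a Tamr utility: `port_is_open` is a hypothetical helper name, and a successful TCP connection only shows that something is listening on the port, not that the matching service is healthy.

```python
import socket


def port_is_open(host: str, port: int = 9170, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    9170 is the default low-latency matching port named in this page;
    pass a different value if the port was changed in configuration.
    """
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when the port cannot be reached from this machine.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

For example, `port_is_open("tamr.example.com")` (a placeholder host name) returns `False` when a firewall blocks inbound access to port 9170.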

Typically, after you make changes that affect clusters, you publish them and then use a matching service. If there are no changes (clusters have not been published since the last time you used a matching service), -1 is returned.

Using Low-Latency Match

To use low-latency match, you prepare, update, and then match.

Preparing Low-Latency Match

Prepare a project to use LLM.

To prepare a mastering project to use the low-latency match service, you publish clusters (if needed).

  1. Publish Clusters: POST /v1/projects/{project}/publishedClusters:refresh
  2. Wait For Queryable: GET /v1/projects/{project}:isQueryable
    Poll the status of the update job submitted in Step 1 until the response true is received.
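
The Wait For Queryable step amounts to a polling loop. The sketch below shows one way to write it with the Python standard library. It assumes that the endpoint returns the literal JSON value `true` or `false` as its body, and it omits authentication headers, which your deployment will likely require. The function names, the base URL, the polling interval, and the timeout are all illustrative choices, not values taken from the Tamr API.

```python
import json
import time
import urllib.request


def is_queryable(base_url, project, opener=urllib.request.urlopen) -> bool:
    """GET /v1/projects/{project}:isQueryable and report whether it said true.

    Assumption: the response body is the JSON literal true or false.
    """
    url = f"{base_url}/v1/projects/{project}:isQueryable"
    with opener(url) as resp:
        return json.loads(resp.read()) is True


def wait_until_queryable(check, interval_s=5.0, timeout_s=600.0,
                         sleep=time.sleep, clock=time.monotonic) -> bool:
    """Call check() repeatedly until it returns True or the timeout passes.

    Returns True once the project is queryable, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller would combine the two, for example `wait_until_queryable(lambda: is_queryable(base_url, "my_project"))`, where `base_url` is the scheme, host, and port of your own deployment. Passing `sleep` and `clock` as parameters keeps the loop easy to test without real delays.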

Updating Low-Latency Match

To update the data in a mastering project that is queryable for low-latency matching:

  1. Publish Clusters: POST /v1/projects/{project}/publishedClusters:refresh
  2. Update LLM Data: POST /v1/projects/{project}:updateLLM
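
The two update calls can be issued in order from a short script. This is a sketch under stated assumptions rather than a reference client: `update_requests` and `run_in_order` are hypothetical helper names, authentication headers are omitted, and the sketch does not wait for the publish job to finish before sending the second call, because this page does not state whether that is required. Because the endpoint for Update LLM Data differs from the match endpoints, pass the base URL that applies to these calls in your deployment.

```python
import urllib.request


def update_requests(base_url, project):
    """Build the two POST requests of the update procedure, in order."""
    paths = [
        f"/v1/projects/{project}/publishedClusters:refresh",  # Publish Clusters
        f"/v1/projects/{project}:updateLLM",                  # Update LLM Data
    ]
    return [urllib.request.Request(base_url + p, method="POST") for p in paths]


def run_in_order(requests, send=urllib.request.urlopen):
    """Send each request in turn and collect the HTTP status codes.

    With the default urlopen, a 4xx or 5xx response raises HTTPError,
    so a failed publish stops the run before updateLLM is sent.
    """
    statuses = []
    for req in requests:
        with send(req) as resp:
            statuses.append(resp.status)
    return statuses
```

Keeping request construction separate from sending makes it possible to inspect or log the exact URLs before anything is submitted.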

Low-Latency Matching

To match incoming external records at low latency:

  1. Low-Latency Match: POST /v1/projects/{project}:match
    Submit records in a stream for low-latency matching against the latest queryable records or clusters, where project is the name of a mastering project.
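
A match request could be assembled as sketched below. The request body format is an assumption: this page says only that records are submitted "in a stream", so the sketch encodes them as newline-delimited JSON, one record per line, and uses a JSON content type. The helper names, the record fields, and the base URL are illustrative, and any authentication headers or query parameters that the endpoint expects are not shown. Check the API reference for the actual payload schema before relying on this.

```python
import json
import urllib.request


def encode_records(records) -> bytes:
    """Serialize records as newline-delimited JSON (assumed wire format)."""
    return b"".join(json.dumps(r).encode("utf-8") + b"\n" for r in records)


def build_match_request(base_url, project, records):
    """Build a POST to /v1/projects/{project}:match carrying the records.

    base_url should point at the low-latency matching service, which
    listens on port 9170 by default.
    """
    return urllib.request.Request(
        f"{base_url}/v1/projects/{project}:match",
        data=encode_records(records),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )
```

The resulting request can be sent with `urllib.request.urlopen(request)`; the shape of the response is not described on this page, so the sketch stops at building the request.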



