Returns a stream of MatchingCluster
Note: This endpoint compares incoming query records that pass the blocking model with records in an existing mastering project and returns the cluster that has the highest probability of a match. Along with the v1/projects/{project}:matchRecords endpoint, it replaces the functionality of the v1/projects/{project}:match endpoint, which remains available for backward compatibility.
The request body is a set of external records, separated by newlines, to match against clusters in an existing mastering project. Each external record must have the same attributes as the unified dataset for the project.
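As a minimal sketch of building such a request body, the Python snippet below writes one record per line. It assumes the records are sent as JSON objects (this page shows JSON only for the response), and the attribute names recordId, name, and city are hypothetical stand-ins for the attributes of your unified dataset.

```python
import json


def build_request_body(records):
    """Serialize records as newline-separated JSON objects, one record per line."""
    # With default settings json.dumps emits no raw newlines,
    # so each record is guaranteed to stay on a single line.
    lines = [json.dumps(record) for record in records]
    return ("\n".join(lines) + "\n").encode("utf-8")


# Hypothetical records: the attribute names are assumptions and must be
# replaced with the attributes of the project's unified dataset.
records = [
    {"recordId": "8793219", "name": "Acme Corp", "city": "Boston"},
    {"recordId": "8800364", "name": "Globex", "city": "Springfield"},
]

body = build_request_body(records)
print(body.decode("utf-8"), end="")
```

Writing the body to a file and posting that file is one way to keep the record boundaries intact.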
Requirements for Streaming Records
To provide multiple records as input, or to stream records, follow these guidelines:
- Swagger endpoints available within Tamr Core do not support streaming. To add multiple input records, use cURL for this endpoint.
- When making an LLM match request with cURL, use the --data-binary option instead of the -d option.
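The likely reason for the second guideline is that cURL's -d option strips carriage returns and newlines when it reads data from a file, which would merge all records into a single line, whereas --data-binary sends the bytes unchanged. The Python sketch below only imitates that stripping to show its effect on a two-record body. It does not call cURL, and the record contents are hypothetical.

```python
# Hypothetical two-record request body, one JSON record per line.
body = b'{"recordId": "1"}\n{"recordId": "2"}\n'


def strip_line_breaks(data: bytes) -> bytes:
    """Imitate what cURL's -d does to file input: drop CR and LF bytes."""
    return data.replace(b"\r", b"").replace(b"\n", b"")


def count_records(data: bytes) -> int:
    """Count newline-separated records in a request body."""
    return len(data.splitlines())


print(count_records(body))                     # 2: record boundaries preserved
print(count_records(strip_line_breaks(body)))  # 1: records merged into one line
```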
Response Fields
The probability that records will match clusters is returned as a response stream, so matches are returned as soon as the first batch of records is processed. For clusters, the response output is similar to the following example:
{"entityId": "8793219", "clusterId": "c3", "avgMatchProb": 0.73}
{"entityId": "8800364", "clusterId": "c2", "avgMatchProb": 0.89}
Cluster Parameters
Field | Description |
---|---|
entityId | The ID of the record from the POST body. |
clusterId | The ID of the cluster the record was compared against. |
avgMatchProb | The average of the match probabilities between the record and each record in the cluster. |
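Because each line of the response is a standalone JSON object, a client can handle matches one at a time as they arrive. The following Python sketch, offered as one possible approach rather than an official client, parses the example output above into typed objects and keeps only matches above a cutoff. The THRESHOLD value and the parse_matches helper are hypothetical, and an in-memory stream stands in for the real HTTP response.

```python
import io
import json
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class MatchingCluster:
    entity_id: str         # entityId: ID of the record from the POST body
    cluster_id: str        # clusterId: cluster the record was compared against
    avg_match_prob: float  # avgMatchProb: average match probability


def parse_matches(lines: Iterable[str]) -> Iterator[MatchingCluster]:
    """Yield one MatchingCluster per non-empty line, as lines arrive."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between results
        obj = json.loads(line)
        yield MatchingCluster(obj["entityId"], obj["clusterId"], obj["avgMatchProb"])


# Stand-in for the response stream, using the example output above.
response = io.StringIO(
    '{"entityId": "8793219", "clusterId": "c3", "avgMatchProb": 0.73}\n'
    '{"entityId": "8800364", "clusterId": "c2", "avgMatchProb": 0.89}\n'
)

THRESHOLD = 0.8  # hypothetical cutoff, not a documented default
confident = [m for m in parse_matches(response) if m.avg_match_prob >= THRESHOLD]
print(confident)
```

Using a generator keeps memory use flat and lets the caller act on early results without waiting for the whole stream to finish.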
API Properties
- Request Type: Synchronous. Match requests use the mastering project's most recent model.
- Request Processing: Streaming
- Response Processing: Streaming
Note: This method has two query parameters that are available in limited release: transform and defaultSourceName. Tamr recommends using these parameters for testing purposes only.