Tamr Documentation
v2021.3.0-2021.6.0
Page Not Found
Release Notes
Release Notes
Tamr Overview
Solving Data Quality Challenges with Tamr
Schema Mapping Projects
Mastering Projects
Curating and Reviewing Record Clusters
Golden Records Projects
Categorization Projects
Categorization Tasks by Team Member Role
Transformations
Tokenizers and Similarity Functions
Working with Geospatial Data
Geospatial Data Types
Glossary
Reviewer User Guide
Navigating Data
Viewing Records
Searching Records
Opening a URL in the Record's Data
Reviewing Record Pairs and Clusters
Filtering Record Pairs
Reviewing and Labeling Record Pairs
Reviewing Categorizations
Filtering Categorizations
Categorizing Records
Curator User Guide
Schema Mapping Workflow
Unified Attributes
Working with Unified Attributes
Mapping Recommendations
Generating Attribute Recommendations
Previewing the Unified Dataset
Mastering Project Workflow
Working with the Unified Dataset
Working with Record Pairs
Defining the Blocking Model
Assigning Record Pairs
Curating Record Pairs
Reviewing Clusters
Filtering Clusters
Publishing Clusters
Assigning Clusters
Curating Clusters
Examples of Cluster ID Changes
Golden Records Workflow
Working with Golden Records
Golden Record Consolidation Rules
Managing Record Categorizations
Working with the Unified Dataset
Working with the Taxonomy
Taxonomy Design Principles
Using the Categorization Curator Dashboard
Categorizing Records
Assigning Records in Categorization Projects
Filtering Records in Categorization Projects
Curating Record Categorizations
Working with Taxonomies
Uploading a Taxonomy File
Renaming a Taxonomy
Deleting a Taxonomy
Navigating a Taxonomy
Adding a Category
Renaming a Category
Removing a Category
Administrator User Guide
Working with Datasets
Uploading a Dataset into Tamr
Profiling a Dataset
Previewing a Dataset
Exporting a Dataset
Deleting a Dataset From All Projects
Filtering Datasets by Type
Managing Dataset Tags
Datasets Generated by Tamr
Attaching a Policy to a Dataset
Working with Tamr Projects
Creating a Project
Editing a Project
Deleting a Project
Adding a Dataset to a Project
Removing a Dataset from a Project
Managing User Accounts and Access
Users, Roles, and Groups
Permissions Matrix by User Role
User Policies
Navigating the Users Page
Auditing User Policies and Access
Creating a User
Editing a User's Information or Password
Editing a User's Roles
Editing a User's Groups
Editing a Group's Roles
Activating or Deactivating a User
Managing Jobs
Viewing Job Description Details
Viewing Job Status Details
Cancelling a Job
Configuring Jobs to Run Concurrently
System Administrator Guide
Deployment Options
Single-Node Deployments
Deploying Tamr on AWS
Deploying Tamr on Google Cloud Platform
Deploying Tamr on Azure
Installation
Requirements
Installing
Installing PostgreSQL
License Key
Restarting
Security Configurations
Configuration
Configuring Tamr
Command Reference
Backup Configuration
HDFS
Low-Latency Match Service
PostgreSQL
Spark Environment Configuration
YARN Cluster Manager Jobs
External Storage Providers
Auxiliary Services
HBase
Configuration Variable Reference
Validation, Upgrades, and Backups
Utilities for Validation and System-Wide Processes
Upgrading Tamr
Upgrading PostgreSQL
Backup
Restore
Security
HTTPS
LDAP Authentication and Authorization
SAML Authentication
Monitoring
Supported Monitoring Tools
Metrics
Querying Time Series Data: Examples
Logging
Transformations
Overview
Getting Help with Transformations
Managing Primary Keys
Data Types and Transformations
Working with Transformations for Geospatial Data
Using Fill, Unpivot, and Formula
Fill Option
Unpivot Option
Formulas
Writing Transformation Scripts
Example Tasks for Transformation Scripts
Working with Statements
Aggregating Records
Disaggregating Records
Removing Records
Referencing Other Datasets
Checkpoint
Drop
Explode
Filter
Group By
Join
Lookup
Merge
Order
Pivot
Repartition
Rows
Sample
Select
Union All
Unpivot
Use
Window
Statement Modifiers
Working with Expressions
Aggregating Expressions
Arithmetic Expressions
Case
Spread
Using Logical Comparators
Working with Dates
Working with Regular Expressions
Functions
General Functions
map, filter, and reduce
Array Functions
array.of
Mathematical Functions
Aggregate Functions
GIS Functions
Tips for Troubleshooting Transformations
Tamr Python Client
Tamr Python Client
Playbooks
Categorization Playbook
Batch Operation of a Categorization Project
Continuous Operation of a Categorization Project
Exporting and Importing a Categorization Model
Mastering Playbook
Batch Operation of a Mastering Project
Continuous Operation of a Mastering Project
Bulk Matching External Records
Low-Latency Matching External Records
Exporting and Importing a Mastering Model
Authenticate Requests
Authentication of API Requests
Versioning
Error Handling
Review Data Models
The Dataset Object
The Attribute Object
The Attribute Configuration Object
The Operation Object
The Project Object
The Published Clusters Configuration Object
Manage Datasets, Attributes, and Records
Manage Datasets
get
List all Datasets
get
Retrieve a Specific Dataset
post
Create a Dataset
put
Update a Dataset
delete
Delete a Dataset
post
Materialize a Dataset
get
Retrieve Upstream Dataset Usage
get
Retrieve Downstream Dataset Usage
get
Retrieve a Dataset's Status
get
Stream a Dataset's Records
delete
Truncate a Dataset
Profile Datasets
post
Profile a Dataset
get
Retrieve Profile Information for a Dataset
Manage Attributes
get
List Attributes from a Dataset
get
Retrieve a Specific Attribute
post
Add an Attribute to a Dataset
put
Update an Attribute
delete
Delete an Attribute
Modify Records in a Dataset
post
Insert, Update, or Delete Records in a Dataset
Work with Project Datasets
Manage Projects
post
Create a Project
get
List all Projects
get
Retrieve a Specific Project
put
Update a Project
Work with Input Datasets Within a Project
post
Add an Input Dataset to a Project
get
Retrieve All Input Datasets in a Project
delete
Remove an Input Dataset From a Project
Work with a Project's Unified Dataset
get
Retrieve Project's Unified Dataset
post
Update Unified Dataset
Create Schema Mappings for Attributes
post
Create an Attribute Mapping
get
Retrieve Project's Mappings
delete
Delete an Attribute Mapping
Run Mastering Tasks
Manage Attribute Configuration
get
Retrieve Attribute Configurations
post
Create Attribute Configuration
put
Update an Attribute Configuration
delete
Delete an Attribute Configuration
Stream or Update the Blocking Model
get
Stream a Project's Blocking Model
post
Update a Project's Blocking Model
Create, Retrieve, Predict, and Generate Pairs
post
Generate Pairs
get
Estimate Pairs Count
get
Retrieve Estimated Pairs Count
post
Train Tamr on Current Pairs
post
Predict Pairs
post
Generate High Impact Pairs
Generate, Update, and Publish Clusters
post
Generate Clusters
post
Publish Clusters
get
Retrieve Published Clusters Configuration
put
Update Published Clusters Configuration
get
Retrieve Published Clusters Given Cluster IDs
Run Bulk Match and Low Latency Match Actions
post
Bulk Match Records or Clusters
get
Retrieve Bulk Match Results
post
Perform LLM Match
get
Query LLM Status
post
Update LLM Data
Manage Categories and Taxonomy
Create, Train, and Manage Categories
post
Create Tamr Categorizations
post
Train Tamr on Current Categorizations
get
Retrieve Categories from a Project
get
Retrieve a Category
post
Bulk Create Categories
post
Create a Category
delete
Delete a Category
Manage a Taxonomy
get
Retrieve a Taxonomy
post
Create a Taxonomy
delete
Delete a Taxonomy
List and Retrieve Operations
get
List All Operations
get
Retrieve a Specific Operation
get
Retrieve Supported API Versions
Run Backups and Check Service Health
Manage Storage Providers
get
List all Storage Providers
Check Service Health and Version
get
Get Service Health
get
Get Version
Create a Backup and Restore From a Backup
post
Create a Backup
get
Check Backup Status
post
Restore Tamr from a Backup
get
Check the Restore Status
Use Advanced Actions (Subject to Change)
post
Create or Update User Groups
post
Export Datasets
get
Read the Exported Dataset
post
Procurement Datasets Categorizations
get
Classification Export Model
post
Classification Import Model
Import and Export Dedup Model Dataset
get
Export the Dedup Model
post
Import the Dedup Model
Tamr Versioned API
supported API versions
get
List latest supported API version for each supported major version.
service
get
/service/health
get
/service/version
backups (v1)
get
Get all backups
post
Initiate an asynchronous backup operation
get
Get a backup by ID
post
Cancel a running backup, given its ID
datasets (v1)
get
List all datasets
post
Create a dataset
delete
Delete a dataset
get
Get a dataset by ID
put
Overwrite a dataset
get
List a dataset's attributes
post
Create an attribute for a dataset
delete
Delete an attribute
get
Get an attribute of a dataset
put
Update an attribute
get
Get profile info for a dataset, if available
post
Generate dataset profile information if not already generated
delete
Truncate a dataset's records
get
Stream the contents of a dataset as JSON records
get
Report status of a dataset
get
List a dataset's upstream datasets by ID
get
Report usages of a dataset by project steps and downstream datasets
post
Materialize a dataset and its associated views
post
Modify a dataset's records
instances (v1)
get
Get the description of the most recent restore operation, if any
post
Initiate an asynchronous restore operation
post
Cancel the currently running restore
post
Attempt to log in using the provided credentials
operations (v1)
get
Get a list of all operations.
get
Fetch an operation.
projects (v1)
get
List all projects
post
Create a project
get
Get a project by ID
put
Update (overwrite) a project
post
Publish clusters
get
Get a project's attribute configurations
post
Create an attribute configuration
delete
Delete an attribute configuration
put
Replace (overwrite) an attribute configuration
get
Get an attribute configuration
get
List all project attribute mappings
post
Create an attribute mapping
delete
Delete an attribute mapping
get
List a dataset's attributes
post
Create an attribute for a dataset
delete
Delete an attribute
get
Get an attribute of a dataset
put
Update an attribute
get
Stream the contents of the binning model as JSON records
post
Modify the binning model's records
get
Get configuration related to categorization in this project
put
Update categorization configuration
get
Export categorization labels from a categorization project
post
Import categorization labels into a categorization project
get
Export the categorization model for a categorization project in a zipped format
post
Import the categorization model for a categorization project in a zipped format
post
Train categorization for a categorization project
post
Predict categorization for a categorization project
get
Get the estimated counts of record pairs for each binning model clause
post
Refresh the estimated counts of record pairs for the binning model clauses
post
Refresh Golden Records dataset
post
Update profile for the Golden Records project
delete
Delete a dataset
get
Get a dataset by ID
put
Overwrite a dataset
get
List a dataset's attributes
post
Create an attribute for a dataset
delete
Delete an attribute
get
Get an attribute of a dataset
put
Update an attribute
get
Get profile info for a dataset, if available
post
Generate dataset profile information if not already generated
delete
Truncate a dataset's records
get
Stream the contents of a dataset as JSON records
get
Report status of a dataset
get
List a dataset's upstream datasets by ID
get
Report usages of a dataset by project steps and downstream datasets
post
Generate high-impact pairs
delete
Remove an input dataset from a project
get
Get all input datasets in a project
post
Add an input dataset to a project
delete
Remove an input dataset from a project
post
Publish clusters
post
Get all versions of one or more published clusters
delete
Delete a dataset
get
Get a dataset by ID
put
Overwrite a dataset
get
List a dataset's attributes
post
Create an attribute for a dataset
delete
Delete an attribute
get
Get an attribute of a dataset
put
Update an attribute
get
Get profile info for a dataset, if available
post
Generate dataset profile information if not already generated
delete
Truncate a dataset's records
get
Stream the contents of a dataset as JSON records
get
Report status of a dataset
get
List a dataset's upstream datasets by ID
get
Report usages of a dataset by project steps and downstream datasets
post
Publish clusters
get
Get published clusters configuration
put
Update published clusters configuration
delete
Delete a dataset
get
Get a dataset by ID
put
Overwrite a dataset
get
List a dataset's attributes
post
Create an attribute for a dataset
delete
Delete an attribute
get
Get an attribute of a dataset
put
Update an attribute
get
Get profile info for a dataset, if available
post
Generate dataset profile information if not already generated
delete
Truncate a dataset's records
get
Stream the contents of a dataset as JSON records
get
Report status of a dataset
get
List a dataset's upstream datasets by ID
get
Report usages of a dataset by project steps and downstream datasets
post
Publish clusters
post
Publish the Golden Records output dataset
delete
Delete a dataset
get
Get a dataset by ID
put
Overwrite a dataset
get
List a dataset's attributes
post
Create an attribute for a dataset
delete
Delete an attribute
get
Get an attribute of a dataset
put
Update an attribute
get
Get profile info for a dataset, if available
post
Generate dataset profile information if not already generated
delete
Truncate a dataset's records
get
Stream the contents of a dataset as JSON records
get
Report status of a dataset
get
List a dataset's upstream datasets by ID
get
Report usages of a dataset by project steps and downstream datasets
post
Cluster predicted pairs
delete
Delete a dataset
get
Get a dataset by ID
put
Overwrite a dataset
get
List a dataset's attributes
post
Create an attribute for a dataset
delete
Delete an attribute
get
Get an attribute of a dataset
put
Update an attribute
get
Get profile info for a dataset, if available
post
Generate dataset profile information if not already generated
delete
Truncate a dataset's records
get
Stream the contents of a dataset as JSON records
get
Report status of a dataset
get
List a dataset's upstream datasets by ID
get
Report usages of a dataset by project steps and downstream datasets
post
Cluster predicted pairs
delete
Delete a dataset
get
Get a dataset by ID
put
Overwrite a dataset
get
List a dataset's attributes
post
Create an attribute for a dataset
delete
Delete an attribute
get
Get an attribute of a dataset
put
Update an attribute
get
Get profile info for a dataset, if available
post
Generate dataset profile information if not already generated
delete
Truncate a dataset's records
get
Stream the contents of a dataset as JSON records
get
Report status of a dataset
get
List a dataset's upstream datasets by ID
get
Report usages of a dataset by project steps and downstream datasets
post
Generate pairs for a mastering project
delete
Delete a dataset
get
Get a dataset by ID
put
Overwrite a dataset
get
List a dataset's attributes
post
Create an attribute for a dataset
delete
Delete an attribute
get
Get an attribute of a dataset
put
Update an attribute
post
Train the mastering model
get
Get profile info for a dataset, if available
post
Generate dataset profile information if not already generated
delete
Truncate a dataset's records
get
Stream the contents of a dataset as JSON records
get
Report status of a dataset
get
List a dataset's upstream datasets by ID
get
Report usages of a dataset by project steps and downstream datasets
post
Predict record pairs
post
Get all versions of one or more published clusters, given identifiers of records in them
delete
Delete a project's taxonomy and all of its categories
get
Get the taxonomy of a categorization project
post
Create a taxonomy and add it to a categorization project
get
Get the categories of a categorization project
post
Create a category and add it to a categorization project
delete
Delete a category by ID, along with all of its children
get
Get a category by ID
put
Update a category in a categorization project.
post
Bulk upload categories to a categorization project
get
Get a project's transformations
put
Update a project's transformations
delete
Delete a dataset
get
Get a dataset by ID
put
Overwrite a dataset
get
List a dataset's attributes
post
Create an attribute for a dataset
delete
Delete an attribute
get
Get an attribute of a dataset
put
Update an attribute
get
Get profile info for a dataset, if available
post
Generate dataset profile information if not already generated
delete
Truncate a dataset's records
get
Stream the contents of a dataset as JSON records
get
Report status of a dataset
get
List a dataset's upstream datasets by ID
get
Report usages of a dataset by project steps and downstream datasets
post
Commit the unified dataset of a project
post
Modify a dataset's records
BulkMatch (v1)
get
Fetch match query record results for a Match Results ID.
post
Initiate an asynchronous match query.
LowLatencyMatch (v1)
post
Update low latency match data.
storage providers (v1)
get
List all storage providers