
Tamr Core Release Notes

These release notes describe new features, improvements, and corrected issues in each Tamr Core release.

See our Security page for the latest security updates and resources.

Tamr Core Releases

Important Information for All Releases

Upgrading Tamr Core to a New Release

Follow the upgrade instructions in the Tamr Core documentation for the version to which you are upgrading. If the version to which you are upgrading has patch releases, Tamr strongly recommends installing the latest patch to ensure you have the latest fixes and security enhancements.

Depending on the version from which you are upgrading, you may need to install required checkpoint releases as part of your upgrade path.

Required Checkpoint Releases

When you upgrade Tamr Core, you must upgrade to each of the intervening checkpoint versions before upgrading to a later version.

The following Tamr Core releases are checkpoint releases:

  • v2022.002
  • v2022.001
  • v2021.002
  • v2020.016
  • v2020.004

The upgrade utility prevents you from upgrading past a checkpoint version. For example, the following upgrade paths are allowed: v2020.017 -> v2021.002, v2021.001 -> v2021.002. These upgrade paths are prevented: v2020.016 -> v2021.003, v2019.019 -> v2021.002.
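The checkpoint rule can be sketched in a few lines of Python. This is an illustrative model only, not the upgrade utility's implementation: the function names are hypothetical, the checkpoint list is copied from the list above, and patch numbers are ignored when comparing versions.

```python
import re

# Checkpoint releases as listed in these release notes.
CHECKPOINTS = ["v2020.004", "v2020.016", "v2021.002", "v2022.001", "v2022.002"]


def parse(version):
    """Turn 'v2021.002' or 'v2021.002.0' into (2021, 2); the patch part is ignored."""
    match = re.fullmatch(r"v(\d{4})\.(\d{3})(?:\.\d+)?", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def blocking_checkpoints(current, target):
    """Checkpoints that lie strictly between the current and target versions."""
    low, high = parse(current), parse(target)
    between = [c for c in CHECKPOINTS if low < parse(c) < high]
    return sorted(between, key=parse)


def is_allowed(current, target):
    """A direct upgrade is allowed only if it does not skip a checkpoint."""
    return not blocking_checkpoints(current, target)


def upgrade_plan(current, target):
    """Ordered list of versions to install, ending with the target."""
    return blocking_checkpoints(current, target) + [target]


if __name__ == "__main__":
    print(is_allowed("v2020.017", "v2021.002"))
    print(upgrade_plan("v2019.019", "v2021.002"))
```

Under this model, the allowed and prevented paths in the example above come out as described, and an upgrade from v2019.019 to v2021.002 would stop at v2020.004 and v2020.016 first.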

Installing Patch Releases

For every software release version that has a patch, Tamr strongly recommends that you upgrade to the most recent patch. To install the most recent patch, follow the upgrade instructions and supply the version number of the patch as the version for your upgrade.

If you want to upgrade directly to a patched version, specify the --skipCheckpointReleaseValidation flag when installing Tamr Core. For information, see Upgrading.

Obtaining Release Binary Files

Contact Tamr Support for release binary files for installation and upgrade. Checksums are available for binary validation.
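As a rough illustration of binary validation, the following Python sketch compares a downloaded file against a published checksum. It assumes the checksum is a SHA-256 hex digest, which these notes do not state; confirm the algorithm with Tamr Support and adjust accordingly. The function names are hypothetical.

```python
import hashlib
import hmac


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash the file in chunks so large binaries do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Return True if the file's SHA-256 digest equals the expected hex string."""
    return hmac.compare_digest(file_sha256(path), expected_hex.strip().lower())
```

If the check returns False, the file may be incomplete or corrupted, and downloading it again before installing is the safer choice.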

Known Issues

To view current known issues, please consult the Tamr knowledge base.


Tamr Core 2022 Releases

v2022.008.0 Release Notes

New Features and Improvements

The following new features are included in this release.

Add Customized Buttons to the Tamr Core User Interface

Add configuration for customizable buttons for use in Tamr Core extensions. System administrators can now customize Tamr Core to display a toolbar of additional buttons on any page of the user interface. Using a YAML file, you can design the buttons to either redirect users to a different URL or to complete a POST API call. See Adding a Custom Toolbar Button.

Help Tamr Core Learn through Cluster Verification

A new user interface control is now available to enable the learned pairs feature in mastering projects. When enabled, Tamr Core uses the changes that experts make to record clusters to label existing pairs or generate and label new pairs. To enable learned pairs, see Learned Pairs and the feature’s recommended setting.

Other Improvements

For Schema Mapping projects, the editing dialog no longer includes the fields for currency symbol and spend, as these options do not apply to this project type.

Fixed Issues

This release corrects the following error.

Tamr Core UI freezes or crashes when navigating pages and adding categorization labels. Found in: v2022.005.0. Fix versions: v2022.008.0, v2022.005.2. Includes performance improvements on both the front end and backend for how Tamr Core commits updates for categorization feedback.



v2022.007.0 Release Notes

Important Support Notes for this Release

This release is not supported for AWS cloud-native deployments.

New Features and Improvements

The following new features are included in this release.

  • Delete and update records API should fail when called on a non-source dataset. Versioned APIs for modifying content of datasets now return an error when used on non-source datasets, preventing users from entering an undesirable state. APIs updated: DELETE /v1/datasets/{datasetId}/records and POST /v1/datasets/{datasetId}:updateRecords.
  • Make auxiliary install location configurable. New configuration “TAMR_AUXILIARY_SERVICES_HOME” created. It determines the location where auxiliary service configurations are stored in a Tamr Core installation.
  • Ability to rename unified attributes in user interface. You can now rename unified attributes. After you rename, you must rerun all jobs of a project in order for the name change to propagate to all pages of the project.

Fixed Issues

This release corrects the following errors.

  • Renamed projects cannot be retrieved by the new name in versioned API. Found in: v2021.021.0. Fix versions: v2022.007.0. Tamr Core’s versioned API endpoint GET /v1/projects now supports filtering by the project name that is visible in the versioned API. Previously, if a project had been renamed, requests to this endpoint had to use the original name.
  • Setting keys in the S3 client does not work on EMR. Found in: v2022.003.0, v2022.004.0, v2022.005.0, v2022.006.0. Fix versions: v2022.007.0. Fixes a bug that prevented jobs from running on EMR.
  • TeamCity failure: backup restore failed due to a Bigtable exception (resource exhausted). Fix versions: v2022.007.0. Quota exceptions are now handled more robustly during Bigtable backups.



v2022.006.1 Patch Release Notes

This patch release corrects the 'argument "src" is null' error that could occur after upgrading from v2019.026.0 or earlier to v2022.003.0 or later. This fix reinstates a schema non-null check in the storage driver.

v2022.006.0 Release Notes

Important Support Notes for this Release

This release is not supported for AWS cloud-native deployments.

New Features and Improvements

The following new features are included in this release.

  • New versioned API endpoints are available to generate test records and high-impact clusters, and to compute cluster accuracy metrics. Three new versioned API endpoints are available, which allow you to continuously monitor model performance as part of a continuous mastering pipeline.
    • POST http://localhost:9100/api/versioned/v1/projects/{project}/testRecords:refresh, which generates test records and clusters for users to curate.
    • POST http://localhost:9100/api/versioned/v1/projects/{project}/trainingClusters:refresh, which generates high-impact clusters for users to curate.
    • POST http://localhost:9100/api/versioned/v1/projects/{project}/clustersAccuracy:refresh, which computes cluster accuracy metrics, including Precision and Recall.
  • New Absolute Cosine similarity function is available. Like cosine similarity, this function applies to text values and represents the similarity between two "bags of words". However, this function does not normalize the resulting feature vectors, so the similarity range is [0, infinity).
  • Disable the ZooKeeper AdminServer which consumes the valuable port 8080. The AdminServer defaults to port 8080, which conflicts with many other services. This feature is not needed; it is now disabled and port 8080 is available.
  • Improve documentation for how COALESCE works with arrays. The description of the COALESCE function now clarifies that arrays that are empty and arrays that contain only nulls are not themselves null. COALESCE returns these arrays as the first non-null element.
  • Curators can edit projects in UI. In addition to using the API, curators can now access a UI control to edit project settings.
  • Curators can delete projects in UI. In addition to using the API, curators can now access a UI control to delete projects.
  • Large deltas now force non-incremental updates automatically. By default, Tamr now disables incremental updates automatically if changes since the last update exceed 5%. Tamr continues to respect the setting for the TAMR_DEDUP_DISABLE_INCREMENTAL configuration variable. If this variable is set to true, Tamr always disables incremental updates. If this variable is set to false, Tamr uses the new threshold to determine whether to disable incremental updates.
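The interaction between TAMR_DEDUP_DISABLE_INCREMENTAL and the 5% threshold in the last item can be summarized as a small decision function. This Python sketch is only a model of the behavior described above; the function and parameter names are hypothetical, and how Tamr measures the fraction of changes is not specified in these notes.

```python
CHANGE_THRESHOLD = 0.05  # "more than 5% changes" disables incremental updates


def use_incremental_update(changed_fraction, disable_incremental=False):
    """Decide whether an update may run incrementally.

    changed_fraction: assumed share of changes since the last update (0.0 to 1.0).
    disable_incremental: stands in for TAMR_DEDUP_DISABLE_INCREMENTAL.
    """
    if disable_incremental:
        # The configuration variable always wins when set to true.
        return False
    # Otherwise the threshold decides: only deltas above 5% force a full update.
    return changed_fraction <= CHANGE_THRESHOLD
```

For example, with the variable set to false, a delta of 3% would still run incrementally, while a delta of 12% would not.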

Fixed Issues

This release corrects the following errors.

  • Backup to GCS fails if directory is empty. Found in: v2021.002.3. Fix versions: v2022.006.0. Added a recursive check in v2022.006 for empty directories before gsutil copy tasks.
  • Connect Profile API endpoint gives 500 error. Found in: v2021.021.0. Fix versions: v2022.006.0. There was a regression where the /api/urlIngest/serverfs/delimited/profile endpoint returned a NotImplementedException starting in core-connect version tamr-core-2021.021.0-3.15.0. The issue has been resolved.
  • Null Pointer Exception trying to cancel snapshot operation. Found in: v2022.001.0, v2022.002.0. Fix versions: v2022.006.0, v2022.002.1. Fixes an issue when canceling a snapshot operation after restarting the service.



v2022.005.2 Patch Release Notes

This patch includes performance improvements on both the front end and backend for how Tamr Core commits updates for categorization feedback.

v2022.005.1 Patch Release Notes

This patch release corrects the 'argument "src" is null' error that could occur after upgrading from v2019.026.0 or earlier to v2022.003.0 or later. This fix reinstates a schema non-null check in the storage driver.

v2022.005.0 Release Notes

Important Support Notes for this Release

This release is not supported for AWS cloud-native deployments.

New Features and Improvements

The following new features are included in this release.

  • A new visual transformation, MultiFormula, is available. Use this transformation to apply the same transformation logic to multiple columns.
  • Core Connect is now available. Details follow.
  • Support reading data from and writing data to BigQuery and Salesforce through Core Connect. A separate license is required.

Core Connect Service Available

In past releases, Tamr provided an API-only auxiliary service, df-connect, which enabled developers to import and export data files between Tamr Core and a variety of cloud storage providers. This release integrates this service into Tamr Core as the Core Connect feature, available through the expanded Core Connect API. Interactive Swagger documentation for the Connect API is available at http://<tamr_ip>:9100/docs. To learn more about Core Connect, see the following:

Upgrade Considerations for Current Users of df-connect

  • Current users of df-connect can now use the Core Connect service instead. As part of integrating the df-connect service into Tamr Core, the new Core Connect API is significantly expanded and improved. The default port for Core Connect is 9050, while the df-connect port is 9030. The Core Connect API is also available through port 9100. For example, http://localhost:9100/api/connect/jdbcIngest.

Note: These differences require updates to your import/export scripts.

Before upgrading, you must disable the df-connect auxiliary service. After upgrade, you must update import/export scripts to use the new Core Connect API. See upgrade guidance for df-connect users.

Upgrade Considerations for Current Users of the Data Movement Service

Current users of the Data Movement Service (DMS) API for importing and exporting between Tamr Core and cloud storage can now use Core Connect instead. See upgrade guidance for DMS users.

Note: To import or export files in Parquet format in this release, you must continue to use DMS. See supported file types and cloud platforms.

Supported Database Connections

Core Connect supports connections to many databases. Refer to the Tamr Core documentation for the currently supported databases and driver versions.

Supported File Types and Cloud Platforms

Core Connect supports import and export for the following file types:

  • Avro and delimited files for S3, ADLSGen2, HDFS, GCS, and the server local file system.
  • Newline-delimited JSON files for S3 and server local file system (export only).

Note: Currently, Core Connect does not support Parquet files. To import and export Parquet files, continue to use the Data Movement Service (DMS). Contact Support if you have more questions.

Fixed Issues

This release corrects the following errors.

  • Parquet files on ADLSGen2 greater than 2 GB created by DMS cannot be read. Found in: v2021.006.0. Fix versions: v2022.005.0. Fixed in the version of DMS that ships with v2022.005. This fix corrects a Parquet writer bug affecting large files and improves handling of null, empty array, and [nulls] values when passed through Tamr Core.
  • Job duration does not show while job is running. Found in: v2022.002.0, v2022.003.0. Fix versions: v2022.005.0, v2022.002.1. This release adds a UI fix, which enables job duration information to display on the Jobs page.
  • Column expander broken in CSV upload preview UI. Found in: v2021.014.0. Fix versions: v2022.005.0. This release adds a UI fix, which enables you to expand and shrink columns using a blue vertical line tracker.



v2022.004.1 Patch Release Notes

This patch release corrects the 'argument "src" is null' error that could occur after upgrading from v2019.026.0 or earlier to v2022.003.0 or later. This fix reinstates a schema non-null check in the storage driver.

v2022.004.0 Release Notes

Important Support Notes for this Release

This release is not supported for AWS cloud-native deployments.

New Features and Improvements

The following new features are included in this release.

  • Increase default value of TAMR_HTTP_IDLE_TIMEOUT to 300s.
  • EnrichmentComponent does not handle attribute value where first array element is null. This release changes the handling of attributes that are arrays of strings. Previously, if the first element of the array was null, the enrichment failed. Now, enrichment uses the first non-null element as the value to be enriched. If all elements are null, the array is empty, or the attribute is null, the default value "" is used.
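The new array handling for enrichment can be modeled as follows. This is a minimal Python sketch of the rule described above, not the EnrichmentComponent code; the function name is hypothetical.

```python
from typing import Optional, Sequence


def value_to_enrich(attribute: Optional[Sequence[Optional[str]]],
                    default: str = "") -> str:
    """Pick the value to enrich from an array-of-strings attribute."""
    if attribute is None:
        # A null attribute falls back to the default value.
        return default
    for element in attribute:
        if element is not None:
            # The first non-null element is used, even if it is not at index 0.
            return element
    # Empty arrays and arrays containing only nulls also use the default.
    return default
```

For example, an attribute of [None, "Acme Corp"] yields "Acme Corp", while [], [None, None], and a null attribute all yield "".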



v2022.003.1 Patch Release Notes

This patch release corrects the 'argument "src" is null' error that could occur after upgrading from v2019.026.0 or earlier to v2022.003.0 or later. This fix reinstates a schema non-null check in the storage driver.

v2022.003.0 Release Notes

Important Support Notes for this Release

This release is not supported for AWS cloud-native deployments.

New Features and Improvements

The following new features are included in this release.

  • Optimize cluster editing operations in mastering projects. This change increases performance when processing edit requests. Note that when upgrading to this release, you must run an “Update results” job from the Pairs page before you can edit clusters.
  • For DMS, remove ability to select greater than "8" thread count in the UI and API. Tamr supports up to 8 threads for data import when using the Data Movement Service (DMS); the UI and API have been updated to reflect this maximum supported thread count.
  • Remove the Google BigQuery option in the Connect to Source page. Tamr Core has deprecated support for BigQuery, and as of this release the BigQuery option is no longer available in the Connect to Sources page when uploading datasets.
  • Browser support updates. Going forward, Tamr Core supports Chrome and Edge on Windows 7, 8, and 10. Support for IE11 is deprecated in all versions of Tamr Core. See Requirements for Installing Tamr Core.

Fixed Issues

This release corrects the following errors.

  • Preview button not working. Found in: v2021.020.0. Fix versions: v2022.003.0. Previously, when writing any type of transformation for input or unified datasets, the “Preview” button in the transformations cell did not work.
  • Bootstrapping Do Not Map attributes should not create empty unified attributes. Fix versions: v2022.003.0. Do Not Map attributes are now ignored when bootstrapping multiple source attributes.
  • Token weighting should be hidden in categorization projects. Fix versions: v2022.003.0. Because token weighting is not utilized for categorization projects, the option to select token weighting for machine learning attributes has been removed in Schema Mapping for these projects.



v2022.002.1 Patch Release Notes

This patch release corrects the following issues.

  • Adds a prompt during upgrade if the --exportHBaseSnapshots option is not included.
  • Null Pointer Exception trying to cancel snapshot operation. Fixes a bug when canceling a snapshot operation after restarting the service.
  • Disable the ZooKeeper AdminServer which consumes the valuable port 8080.
  • This release adds a UI fix, which enables job duration information to display on the Jobs page.

v2022.002.0 Release Notes

New Features

The following new features are included in this release.

New Checkpoint Releases

Tamr Core releases v2022.001.0 and v2022.002.0 are checkpoint releases. When you upgrade Tamr Core, you must first upgrade to v2022.001.0, and then to v2022.002.0, before upgrading to a later version.

Upgrade to HBase 2.x Client

This release includes an upgrade of the HBase Java libraries used by Tamr Core from 1.3.1 to 2.2.3. Additionally, the version of HBase that is installed on single-node instances has been upgraded from 1.3.1 to 2.3.6. See Upgrading Tamr Core. If you are upgrading a cloud-native deployment, please contact Tamr Support for guidance.

Important

  • For single-node deployments, you must provide an additional flag, --exportHBaseSnapshots, to the admin utility (unify-admin.sh) during upgrade. To prevent data corruption, see prerequisites before upgrade.
  • Upgrading HBase versions requires significant upgrade time; expect the upgrade to take longer than usual for this release. Upgrade time is highly dependent on the number of projects in your pipeline. For example, if you have 20 projects, expect the upgrade to take at least 3 hours.

Improvements

  • Schema mapping projects: user interface improvement for mapping suggestion counts. The number next to lightbulbs now indicates the top suggested mappings for an attribute at the specified similarity threshold.

Fixed Issues

This release corrects the following errors.

  • Schema mapping suggestion counts are zero or negative for some attributes. Found in: v2021.015.0. Fix versions: v2022.002.0.

Known Issues

Note: For this release, IAM role-based authentication for S3 on DMS storage is not supported on EC2 instances. Tamr recommends using a service principal to import or export data from AWS.



v2022.001.2 Patch Release Notes

This patch release corrects an issue in which upgrade to v2022.001.1 succeeds, but recipe upgrade for projects fails. Found in: v2022.001.1.

Release v2022.001.0 added edit checking to ensure that project and dataset names do not include the characters /, \, or :, a leading ., or leading or trailing white spaces. This patch identifies projects with names that include these characters or spaces and removes the offending characters and spaces from the project names.
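The naming rule can be expressed as a short check. The following Python sketch is a hypothetical model of the rule as stated above, not the code that Tamr Core or this patch runs; in particular, stripping every leading period rather than only the first is an assumption.

```python
FORBIDDEN_CHARS = set("/\\:")  # the characters /, \ and :


def is_valid_name(name):
    """True if the name follows the rule described in these notes."""
    if any(ch in FORBIDDEN_CHARS for ch in name):
        return False
    if name.startswith("."):
        return False
    # Leading or trailing white space is not allowed.
    return name == name.strip()


def sanitize_name(name):
    """Remove forbidden characters, leading periods, and surrounding white space."""
    cleaned = "".join(ch for ch in name if ch not in FORBIDDEN_CHARS)
    previous = None
    # Repeat until stable: stripping a period can expose new leading white space.
    while cleaned != previous:
        previous = cleaned
        cleaned = cleaned.strip().lstrip(".")
    return cleaned
```

A check like this can be useful for finding dataset names that would need attention before you contact Support.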

Contact Tamr Support for assistance if your dataset names include these characters or spaces.

v2022.001.1 Patch Release Notes

This patch release corrects the following issues.

  • Running into malformed YAML issue when upgrading from v2021.006 to v2022.001.

v2022.001.0 Release Notes

New Features and Improvements

The following new features are included in this release.

  • Upgrade the bundled JDK version.
  • Show correct empty state when using the filters in Dataset Catalog page. When using the "Results and Internals" or "System" filter in Dataset Catalog, the message shown in the empty state is now “No datasets matching your filters.”
  • Add disk space available check to upgrade utility. Validation scripts now include a utility to verify that at least 20% of disk space is available when starting or upgrading Tamr Core.
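To run a comparable disk space check yourself before an upgrade, a sketch along these lines may help. It is not the validation script that ships with Tamr Core: the function names are hypothetical, and the path to check is whichever file system holds your Tamr Core installation.

```python
import shutil

MIN_FREE_FRACTION = 0.20  # at least 20% of disk space should be available


def free_fraction(total_bytes, free_bytes):
    """Share of the disk that is free, as a number between 0.0 and 1.0."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def has_enough_disk_space(path, min_free=MIN_FREE_FRACTION):
    """True if the file system containing `path` has enough free space."""
    usage = shutil.disk_usage(path)
    return free_fraction(usage.total, usage.free) >= min_free


if __name__ == "__main__":
    # "." is a placeholder; point this at your installation directory.
    print(has_enough_disk_space("."))
```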

Fixed Issues

This release corrects the following errors.

  • Clustering job stuck. Found in: v2021.010.2. Fix versions: v2022.001.0.
  • Validation script does not correctly identify disk usage scenarios that will break Tamr Core. Found in: v2021.019.0. Fix versions: v2022.001.0.
  • Tamr Core now enforces that project names cannot include the '/', '\', or ':' characters or leading or trailing white spaces. Fix versions: v2022.001.0.
