Tamr Documentation

Release Notes

Notes for Tamr v2020.14.0 and v2020.13.0

Release Notes for v2020.14.0
Release Notes for v2020.13.0

Release Notes for v2020.14.0

These release notes list what's new in this release, improvements, and bug fixes.

What's New

In this release, the following notable changes were made:

  • Released version 0.12.0 of the Python tamr-client. For more information, see tamr-client 0.12.0 and Tamr Client documentation.
  • API changes. Added a new parameter, expectedVersion, to the POST /datasets/{name}/update endpoint in the dataset service to allow consistent dataset updates.
  • Usability and design improvements.
    • Observe in the tooltip that profiling value counts are estimates, when examining results of profiling a dataset.
    • Use the Rules tab, when working with the golden records project as a curator.
  • Performance and configuration improvements. Take advantage of improved HBase performance when processing Tamr jobs.
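
The expectedVersion parameter noted above suggests an optimistic concurrency check: an update is applied only if the dataset is still at the version the caller last saw. The following Python sketch illustrates that general idea in memory. It is not Tamr's implementation, and the VersionedDataset class, the VersionConflict exception, and the integer version counter are hypothetical names introduced only for illustration.

```python
class VersionConflict(Exception):
    """Raised when the caller's expected version is stale (hypothetical)."""


class VersionedDataset:
    """Hypothetical in-memory stand-in for a dataset with a version counter."""

    def __init__(self):
        self.version = 0
        self.records = {}

    def update(self, changes, expected_version=None):
        # If the caller supplies an expected version, reject the update when
        # another writer has already moved the dataset past that version.
        if expected_version is not None and expected_version != self.version:
            raise VersionConflict(
                f"expected version {expected_version}, found {self.version}"
            )
        self.records.update(changes)
        self.version += 1
        return self.version


dataset = VersionedDataset()
seen = dataset.version  # two writers both read the dataset at version 0

dataset.update({"r1": "a"}, expected_version=seen)  # first writer succeeds
try:
    dataset.update({"r1": "b"}, expected_version=seen)  # second writer is stale
except VersionConflict as err:
    conflict_message = str(err)
```

In this sketch the second writer's change is rejected instead of silently overwriting the first, which is the kind of consistency an expected-version check is typically meant to provide. How the actual endpoint reports a mismatch is not described in these notes.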

Fixed Issues

  • Fixed an issue where the "read only" permissions on /tmp prevented Tamr dependencies from starting. Affects versions: v2020.008.1. Fix versions: v2020.014.0.
  • Fixed an issue where the Tamr instance was broken after deleting an input dataset. Affects versions: v2020.008.1. Fix versions: v2020.014.0.
  • Fixed an issue where deleting the unified dataset caused Tamr to throw a null pointer exception. Affects versions: v2020.008.0. Fix versions: v2020.014.0.
  • Fixed an issue where profile value counts did not indicate that they are estimates. Affects versions: v2020.009.0. Fix versions: v2020.014.0.
  • Fixed an issue in the golden records project where it did not show the Rules tab to curators. Affects versions: v2020.006.0. Fix versions: v2020.014.0.
  • Fixed an issue where Postgres Prometheus configuration used HOST_IP instead of TAMR_POSTGRES_HOSTNAME.
  • Added TAMR_PERSISTENCE_EXPORTER_USER and TAMR_PERSISTENCE_EXPORT_PASS to the configuration definitions.

Release Notes for v2020.13.0

These release notes list what's new in this release, improvements, and bug fixes.

What's New

In this release, you can:

  • Rely on faster-running Spark jobs due to HBase configuration improvements.
  • Avoid dataset errors when updating or publishing golden records due to improved dataset validation checks.
  • Collect logs for a specified time period using a new flag on the collect-logs.sh script.

Improvements and Changes

  • Upgraded Grafana to 6.3.4 and Kibana to 5.6.16. Affects versions: v2020.004.0. Fix versions: v2020.013.0, v2020.004.2.
  • HBase. Stopped blocking new jobs while HBase rollback is in progress.
  • HBase. Adjusted the buffer to store enough records for sorting streaming updates to HBase.
  • Allowed LLM and Bulk Matching on projects with Mastering functions and user-defined signals. Affects versions: v2020.002.0. Fix versions: v2020.013.0.

Fixed Issues

The following issues were fixed in this release.

  • Updated collect-logs.sh to accept an age field to enable collecting logs for only a certain number of days. Affects versions: All. Fix versions: v2020.013.0.
  • Fixed the log pruning scripts to set dependencies correctly.
  • Fixed an issue where the user policy management dialog deselected datasets as you paginated.
  • Fixed an issue where you could not edit project and dataset user policies without deselecting other datasets in the policy. Affects versions: All. Fix versions: v2020.013.0.
  • Fixed an issue in working with geospatial data, where displaying multi-point data in a Leaflet map caused the user interface to blank out. Affects versions: v2020.004.1. Fix versions: v2020.013.0, v2020.004.2.
  • Fixed an issue where you could not import pair labels when the pre-grouping feature was enabled. Affects versions: v2020.009.0. Fix versions: v2020.013.0.
  • Fixed an issue where removing a source dataset did not remove it from pair exclusions in the internal configuration. Affects versions: v2020.004.1. Fix versions: v2020.013.0.
  • Made estimate pairs sampling configurable in internal interfaces. Affects versions: All. Fix versions: v2020.013.0.
  • Reduced indexing of unneeded internal datasets when Elasticsearch is disabled. Affects versions: v2020.004.1. Fix versions: v2020.013.0.

Documentation Changes

Beginning with Tamr v2020.013.0, documentation versions available at docs.tamr.com are listed as ranges of versions.

  • Documentation version ranges map to the development releases contained within the range.
  • For example, Tamr documentation version 2020.13.0-2020.16.0 (this current range) maps to four consecutive development releases.
  • At the time of this writing, only the first of these development releases is available, Tamr v2020.013.0. The other releases in this range will become available in the future.
  • The documentation for versions in the range is updated in place and republished.
  • The release notes for each development release continue to be published.
  • For information about deltas between individual development releases, see the release notes for each development release. Also see the Changelog for a running list of release notes.
  • The documentation version scheme differs slightly from the development version scheme in that it does not use leading zeros in its numbers. For example, Tamr development version 2020.013.0 is represented as the documentation version 2020.13.0 (the leading zero in front of 13 is omitted). This is by design and the two version notations map to each other.
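
The mapping between the two version notations can be sketched as a small Python helper. This is an illustration of the rule stated above (drop leading zeros from each component), not a Tamr utility; the function name to_doc_version and the handling of an optional "v" prefix are assumptions made for this example.

```python
def to_doc_version(dev_version: str) -> str:
    """Convert a development version string to the documentation notation.

    Each dot-separated component loses its leading zeros; an optional
    leading "v" is preserved (an assumption made for this sketch).
    """
    prefix = "v" if dev_version.startswith("v") else ""
    parts = dev_version[len(prefix):].split(".")
    # int() drops leading zeros and keeps a lone "0" as "0".
    return prefix + ".".join(str(int(part)) for part in parts)


doc_version = to_doc_version("2020.013.0")
```

With the example from these notes, to_doc_version("2020.013.0") returns "2020.13.0".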

Known Issues

The following known issues exist in this release.

  • Status field (text and icon) on the Jobs page is not centered and the icon is truncated.
  • Mapped/Unmapped attribute filters do not work on any downstream project in a project with chained datasets after an upgrade to v.2019.023.1 or later. If you encounter this issue, contact Tamr Support for information about a workaround (running an internal-only API request that calculates attribute mappings in this case).
  • Column resizing on the Users page does not behave as expected.
  • The schema mapping project is not showing out-of-dateness for projects.
  • The Unified Dataset page throws an error in the user interface when you are logged in as a reviewer.
  • Job submission for chained projects may not appear immediately on the Jobs page after choosing Submit. Submit is not disabled in this case. Pre-processing of dataset versions takes place before Tamr submits the jobs to Spark and Tamr is not currently accounting for this time on the Jobs page.
  • The job for Updating results is not showing the project it is associated with on the Jobs page.
  • The upgrade process updates all record pair feedback to use unified record IDs instead of origin record IDs. This process runs automatically when upgrading. However, it depends on the Elasticsearch index being up-to-date for the unified dataset before you start the upgrade. If the index is not up-to-date at the time of upgrading to version v.2019.024 or later, the migration has no effect and the pre-upgrade pair feedback is neither migrated nor deleted. As a workaround, index the unified dataset in Elasticsearch before you upgrade. After you upgrade, call the following endpoint manually: /api/dedup/pairs/feedback/migrate. Then run the job that updates pairs for your project.
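
The manual call in the pair feedback workaround above could be prepared along these lines. This Python sketch only builds the request and does not send it. The endpoint path comes from these notes; the host name, the credentials string, and the use of POST are assumptions, since the notes do not state the HTTP method or authentication scheme. Confirm both against your deployment before running anything.

```python
from urllib.parse import urljoin
from urllib.request import Request

# Endpoint path as given in the known issue above.
MIGRATE_PATH = "/api/dedup/pairs/feedback/migrate"


def build_migrate_request(base_url: str, method: str, auth_header: str) -> Request:
    """Build (but do not send) a request for the feedback migration endpoint."""
    url = urljoin(base_url, MIGRATE_PATH)
    return Request(url, method=method, headers={"Authorization": auth_header})


# Hypothetical placeholder values; the method "POST" is an assumption.
request = build_migrate_request("http://tamr.example.com", "POST", "<credentials>")
```

To send the request, pass it to urllib.request.urlopen only after the unified dataset has been indexed and the upgrade has completed, and then run the job that updates pairs as described above.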

Known Issues with Geospatial Support

  • Cannot update golden records with a geospatial data type present.
  • A warning displays in the browser console for any page or details panel that contains geospatial record types in mastering projects.
  • An error occurs when you attempt to sort by a geospatial type attribute.
  • Most frequent values are not showing up for a geospatial type field on a profiled attribute on the Schema Mapping page.
  • The schema mapping suggestion fails if a geospatial record type attribute is present.
  • Export is failing on a dataset with geospatial data.

Upgrade

For information, see Upgrading Tamr.

