Tamr Documentation

v2019.016 Notes

Tamr Unify release notes.

What's New

  • User-centric view of policies and access to datasets and projects. Administrators can now audit user policies and access to datasets and projects per user.
  • Ability to specify preferred input datasets for each attribute in golden records. Golden records now include a dataset filter for each attribute, which allows you to specify preferred input datasets.

Major Improvements and Fixes

Access Control

  • The Users page now shows a table with the Member of Policy column for each user. This column replaces the Roles column. You can see the number of policies for each user and use the link to obtain more information. Selecting the Member of Policy link displays a detailed report of all policies, users who use them, and resources in each policy. This report is per user. This allows you, as an administrator, to audit user policies, and review access to datasets and projects by each user. In the same report, you can add and remove the user from a particular policy. Adding the user to a policy allows you to choose one of the predefined user roles, curator or reviewer.
  • Improved the navigation bar to reflect user roles. In particular, administrators see links to jobs, the dataset catalog, users and groups, and permissions, and curators and reviewers only see the link to jobs.

Golden Records

The list of filter conditions for attributes in golden records now includes a new condition type, dataset. Choosing a dataset condition allows you to configure a filter that selects the value for an attribute from a list of one or more preferred input datasets. If the list of input datasets changes, for example because new input datasets were added to an existing Mastering project, the newly added datasets are marked with the "New" label in the drop-down list.
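As a rough illustration of what a dataset condition does, the sketch below keeps only the candidate values that come from preferred input datasets. It is not Tamr's implementation: the function name `filter_by_dataset`, the record shape (dictionaries with `dataset` and `value` keys), and the dataset names are all hypothetical.

```python
def filter_by_dataset(records, preferred_datasets):
    """Return the non-empty values contributed by preferred input datasets.

    `records` is a hypothetical list of dicts, one per clustered source
    record, each with a "dataset" name and a "value" for one attribute.
    """
    preferred = set(preferred_datasets)
    return [
        r["value"]
        for r in records
        if r["dataset"] in preferred and r["value"] is not None
    ]


# Hypothetical cluster: three source records for a single attribute.
cluster = [
    {"dataset": "crm.csv", "value": "Acme Corp"},
    {"dataset": "billing.csv", "value": "ACME"},
    {"dataset": "web_leads.csv", "value": "acme"},
]

candidates = filter_by_dataset(cluster, ["crm.csv", "billing.csv"])
print(candidates)
```

The filtered candidates would then be passed to whichever golden record rule is configured for the attribute.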

Configuration

  • Yarn. Added a TAMR_YARN_TEMP_DIR configuration property that allows you to explicitly configure the directory for storing temporary files in Yarn. This is useful if you need to control access to this directory. If not specified, the default directory is set by the hadoop.tmp.dir property.
  • HBase. Added the TAMR_LAUNCHER_INTERVAL (in minutes) and TAMR_HBASE_CHMOD_ENABLED configuration properties to tamr/conf/unify-custom-config.yaml. Before you install, Tamr recommends that you either deploy the software on a server that has single-user access, or control access by setting these properties as follows: TAMR_LAUNCHER_INTERVAL: "1" and TAMR_HBASE_CHMOD_ENABLED: "true".

Other Improvements and Fixed Issues

  • The Policy page is now responsive to user actions. The delays observed in the previous release are fixed.
  • Improved the behavior of user comments, and added a "Last modified" timestamp to user comments.
  • Fixed an issue in the Mastering project where searching for users (to assign records to them) was previously case-sensitive.
  • Fixed an issue in Clustering where users could not lock and unlock an individual record in the records table.
  • Fixed an issue in categorization where the Clear All Filters button was missing for records with the high impact filter.
  • Geospatial records support. Fixed an issue where the link did not display for records with an exact match on the geospatial attribute. Improved the pair matching UI for geospatial records.
  • Improved performance of golden records with respect to the number of attributes and filters.
  • The tie-breaking behavior of the golden record rule most common value now takes the minimum value through an ascending sort. For example, if “apple” and “banana” are the most common values in the group, the rule chooses “apple”. Previously, the behavior was inconsistent for two values with the same counts.
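The tie-breaking behavior of the most common value rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Tamr's implementation: the function name `most_common_value` is hypothetical, and it assumes values are compared with Python's ordinary ascending ordering.

```python
from collections import Counter


def most_common_value(values):
    """Return the most frequent value; ties go to the smallest value."""
    counts = Counter(v for v in values if v is not None)
    if not counts:
        return None
    top = max(counts.values())
    # Collect every value that shares the highest count, then take the
    # minimum, which is the first element of an ascending sort.
    tied = [v for v, c in counts.items() if c == top]
    return min(tied)


# "apple" and "banana" both appear twice, so the tie goes to "apple".
result = most_common_value(["banana", "apple", "banana", "apple", "cherry"])
print(result)  # apple
```

Taking the minimum of the tied values makes the result deterministic regardless of the order in which records arrive.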

Support Tickets

  • Fixed an issue where searching for users in the Mastering project (to assign records to them) was previously case-sensitive.
  • Fixed an issue where profiling jobs failed with a Snappy (compression) error: "Could not initialize class org.xerial.snappy.Snappy". The fix adds the TAMR_YARN_TEMP_DIR property, which lets you specify the Yarn temp directory for applications that require access to it.
  • Fixed an issue with sorting users by their full name in CJK (Chinese, Japanese, and Korean) characters in the assignment dialog box. In previous releases, this sort worked only for ASCII characters.
  • The SAML property RequestedAuthenticationContext is now configurable via the Tamr configuration variable TAMR_SAML_AUTH_COMPARISON_TYPE.
  • Fixed an issue where the logical condition in golden record rule filters reported an incorrect record count in clusters.

Known Issues

The following known issues exist in this release.

Jobs management

  • The Updating results job does not show its associated project on the Jobs page.

Geospatial support

  • Cannot update golden records with a geospatial data type present.
  • A warning displays in the browser console for any page or details panel that contains the geospatial record types in mastering projects.
  • The image for polygons is cut off (not adjusted to scale) on the Pairs page.
  • An error occurs when sorting by a geospatial type attribute.
  • Most frequent values do not display for a geospatial type field on a profiled attribute on the Schema Mapping page.
  • The schema mapping suggestion fails if a geospatial record type attribute is present.
  • Export fails on a dataset with geospatial data.

Upgrade

See the upgrading page for instructions.
