Tamr Documentation

v2019.008 Notes

Tamr Unify release notes.

What's New

General

  • Implement fixed ratio Spark cores and memory allocation
| Total memory available for Spark | Driver memory | Executor JVM instances | Total memory available for executors | Executor cores |
|---|---|---|---|---|
| 19g or less | 3g, 1 core | 1 | Total available | Max available but no more than 1 core per 2g executor memory |
| 35g or less | 3g, 1 core | 2 | Total available/2 | As above |
| 51g or less | 3g, 1 core | 3 | Total available/3 | |
| Less than 67g | 3g, 1 core | 4 | Total available/4 | As above |
| 67g and greater | 3g, 1 core | 4 | 16g | As above |
| Unknown (remote Spark cluster) | 3g, 1 core | 4 | 16g | 8 |
| Less than 9g | 1g, 1 core | 1 | Total available | Max available but no more than 1 core per 1g executor memory |

  • Do not run local instance of Spark if remote Spark is used
    -- There is a new boolean configuration property called TAMR_REMOTE_SPARK_ENABLED. This property is false by default, and if set to true, start-dependencies.sh will not start a local Spark process.
    -- Important: This property is not set automatically on upgrade. If you are using a remote Spark cluster (YARN, Dataproc, etc.), manually set this property to true.
  • Ability to cancel running Spark jobs
    -- You can now cancel submitted and running Spark jobs in addition to pending jobs.
    -- Cancellation is a best-effort, asynchronous action. The job might succeed or fail before the cancellation request reaches it, which causes the cancellation to fail.
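
The fixed-ratio allocation rules listed under the first bullet can be read as a small lookup function. The following is a minimal sketch of one reading of those rules, not Unify's actual implementation. The names `SparkAllocation` and `allocate` are hypothetical. It assumes that "Total available" means the total memory minus the driver memory, that the executor memory column is per executor JVM, and that the 51g row (whose executor cores value is not listed) follows the same 1-core-per-2g cap as its neighbors.

```python
from typing import NamedTuple, Optional


class SparkAllocation(NamedTuple):
    """Hypothetical container for one row of the allocation rules."""
    driver_memory_gb: int
    executor_instances: int
    executor_memory_gb: float   # per executor JVM (assumption)
    executor_core_cap: int      # upper bound on cores per executor


def allocate(total_gb: Optional[float] = None) -> SparkAllocation:
    """Map total Spark memory (in GB) to an allocation.

    Pass None when the total is unknown (remote Spark cluster).
    """
    if total_gb is None:
        # Unknown total: fixed values from the release notes.
        return SparkAllocation(3, 4, 16.0, 8)
    if total_gb <= 1:
        raise ValueError("need more than 1g to fit a driver and an executor")

    if total_gb < 9:
        # Small machines: 1g driver, up to 1 core per 1g of executor memory.
        driver, instances, gb_per_core = 1, 1, 1
    else:
        # Otherwise: 3g driver, up to 1 core per 2g of executor memory.
        driver, gb_per_core = 3, 2
        if total_gb <= 19:
            instances = 1
        elif total_gb <= 35:
            instances = 2
        elif total_gb <= 51:
            instances = 3
        else:
            instances = 4

    if total_gb >= 67:
        executor_mem = 16.0  # fixed at 16g per executor
    else:
        # Assumption: "Total available" excludes the driver memory.
        executor_mem = (total_gb - driver) / instances

    # Keep at least one core so the executor can run (assumption).
    core_cap = max(1, int(executor_mem // gb_per_core))
    return SparkAllocation(driver, instances, executor_mem, core_cap)


if __name__ == "__main__":
    for total in (8, 19, 35, 60, 100, None):
        print(total, allocate(total))
```

Under these assumptions the thresholds line up with a 3g driver plus 16g per executor (19g, 35g, 51g, 67g), and the cap of 8 cores for a remote cluster matches 16g at 1 core per 2g. The actual core count also depends on how many cores the machine has, which this sketch does not model.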

General Improvements and Major Bug Fixes

General

  • When opening the “Add new CSV” dialog in the “Add a new dataset” dialog on the “Datasets” page in Transformations, the file picker opens automatically.
  • Set the initial number of HBase regions to 1
  • Clarify description of unified dataset attributes on Schema Mapping page
  • Jobs running progress bar resized
  • Updated Unify log collection script

Mastering

  • Running an update results job on a mastering dataset no longer causes a write lock exception if another mastering job is currently running.
  • The publish clusters job now lists the project it was run on in the project column on the jobs page.

Transformations

  • Transformations can be previewed on custom dataset samples.
    -- To configure this, follow the steps on this page: Setting a custom preview sample.
    -- Note: Configuring this functionality is API-only (though the effects can be seen in the UI).

Upgrade

See the upgrading page for instructions.
