Tamr Documentation

Configuration Variable Reference

Complete list of Tamr configuration variables.

Properties for Configuration Variables

Tamr configuration variables have the following properties.

  • machineSpecific
    If False, backup operations include the value defined for this variable and the corresponding restore operation will restore it. If True, the value defined for this variable is not saved to the backup or restored.
  • dependencies
    Identifies whether the variable affects a Tamr microservice or an external service. If ['unify'], the variable affects Tamr. If ['supporting'], the variable affects a dependency outside of the Tamr software, such as HBase, Elasticsearch, ZooKeeper, Spark/YARN, or PostgreSQL.
  • formula
    The Java class used to calculate the default value for this configuration variable. If present, this variable is set based on a calculation that can rely on the settings for other configuration variables.
  • secure
    If True, the value defined for this variable is stored in encrypted form. If False, the value is stored in plain text.
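
To illustrate how the machineSpecific property affects backups, the following minimal Python sketch models a variable's properties and selects the values that a backup would include. The ConfigVariable class and the values_for_backup function are hypothetical names used for illustration only; they are not part of the Tamr software.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ConfigVariable:
    """Hypothetical model of one configuration variable and its properties."""
    name: str
    value: Optional[str]
    machine_specific: bool
    dependencies: List[str]
    secure: bool
    formula: Optional[str] = None  # Java class name, when a formula is defined


def values_for_backup(variables: List[ConfigVariable]) -> Dict[str, Optional[str]]:
    """Return only the values a backup would include (machineSpecific is False)."""
    return {v.name: v.value for v in variables if not v.machine_specific}


variables = [
    ConfigVariable("APPS_DMS_PORT", "9155", True, ["unify"], False),
    ConfigVariable("TAMR_AUTH_LOG_LEVEL", "INFO", False, ["unify"], False),
]
print(values_for_backup(variables))
```

In this sketch only TAMR_AUTH_LOG_LEVEL is selected, because APPS_DMS_PORT is machine specific.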

APPS_DMS_ENABLED

Set to true to enable the Data Movement Service (DMS). When DMS is enabled, also set APPS_DMS_DEFAULT_CLOUD_PROVIDER to specify the cloud provider to use with DMS. Currently, you can specify only one default cloud provider for the UI. To use alternative cloud providers, refer to the DMS API.

DEFAULT VALUE

False

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify', 'supporting']
  • secure: False

APPS_DMS_DEFAULT_CLOUD_PROVIDER

For cloud-native Tamr deployments with APPS_DMS_ENABLED set to true, defines the default cloud service provider for DMS. Valid values include "GCS", "S3", or "ADLS2".

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

APPS_DMS_MAX_CONCURRENT_REQUESTS

For cloud-native Tamr deployments with APPS_DMS_ENABLED set to true, defines how many DMS jobs can run at the same time. Running more jobs concurrently takes more memory; see APPS_DMS_MEMORY. Parquet files are read one row group chunk at a time and require more memory per thread. If you use Parquet and run into out-of-memory errors, try reducing the number of threads set by the sinkConfig of your API call, decreasing the number of concurrent requests, increasing APPS_DMS_MEMORY, or recreating the Parquet file with a smaller row group size.

DEFAULT VALUE

2

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

APPS_DMS_MEMORY

For cloud-native Tamr deployments with APPS_DMS_ENABLED set to true, defines the amount of memory allocated to the DMS driver. As a rule, set APPS_DMS_MEMORY to APPS_DMS_MAX_CONCURRENT_REQUESTS * 2 * (maximum row group size of the Parquet data; default is 128 MB) * (number of threads used to write to Tamr per job; default is 8). If you don't have any Parquet files, use APPS_DMS_MAX_CONCURRENT_REQUESTS * 2 * 10 MB * (number of threads used to write to Tamr). A good default for CSV files is 320M instead of 2G.
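
The sizing rule for APPS_DMS_MEMORY can be sketched in Python as follows. This sketch assumes the rule is a product of its terms (concurrent requests, a factor of 2, the per-thread chunk size, and the number of writer threads); that reading is consistent with the 320M figure given for CSV files. The function and parameter names are hypothetical and used for illustration only.

```python
def recommended_dms_memory_mb(
    max_concurrent_requests: int,
    threads_per_job: int = 8,
    chunk_mb: int = 128,
) -> int:
    """Estimate APPS_DMS_MEMORY in megabytes.

    chunk_mb is the maximum Parquet row group size (128 MB by default),
    or 10 MB when no Parquet files are involved.
    """
    return max_concurrent_requests * 2 * chunk_mb * threads_per_job


# Two concurrent requests (the APPS_DMS_MAX_CONCURRENT_REQUESTS default).
parquet_mb = recommended_dms_memory_mb(2)           # 2 * 2 * 128 * 8
csv_mb = recommended_dms_memory_mb(2, chunk_mb=10)  # 2 * 2 * 10 * 8
print(f"{parquet_mb}M", f"{csv_mb}M")
```

With two concurrent requests and eight threads, the estimate for CSV data is 320M, and the estimate for Parquet data with 128 MB row groups is 4096M.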

DEFAULT VALUE

2G

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

APPS_DMS_HOSTNAME

For cloud-native Tamr deployments with APPS_DMS_ENABLED set to true, defines the host of the DMS. Don't change this unless you have an external DMS.

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
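
Default values such as {{ HOST_IP }} refer to the value of another configuration variable. This reference does not describe how the substitution is performed, so the following Python sketch is only an assumption about how such placeholders could be resolved. The resolve function and the example address are hypothetical.

```python
import re
from typing import Dict

# Matches placeholders of the form {{ NAME }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def resolve(value: str, settings: Dict[str, str]) -> str:
    """Replace each {{ NAME }} placeholder with the value of NAME from settings."""
    return PLACEHOLDER.sub(lambda match: settings[match.group(1)], value)


settings = {"HOST_IP": "10.0.0.5"}  # example value, not a Tamr default
print(resolve("{{ HOST_IP }}", settings))
```

A value without placeholders, such as 9155, passes through unchanged. A placeholder that names an unknown variable raises a KeyError in this sketch.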

APPS_DMS_PORT

For cloud-native Tamr deployments with APPS_DMS_ENABLED set to true, defines the port for the DMS. Don't change this unless you have an external DMS.

DEFAULT VALUE

9155

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

APPS_DMS_SCHEME

For cloud-native Tamr deployments with APPS_DMS_ENABLED set to true, defines whether the DMS runs on http or https. Don't change this unless you have an external DMS.

DEFAULT VALUE

http

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

ES_FIELDDATA_CACHE_SIZE

DEFAULT VALUE

20%

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

ES_HEAP_SIZE

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • formula: com.tamr.procurify.admin.commands.config.formulas.EsHeapSize
  • secure: False
  • description: None

ES_NUM_SHARDS

The number of shards to set when creating new Elasticsearch indexes. This variable is mainly intended for use with the local Elasticsearch cluster. Default value is the number of cores on the local host machine. For details, see: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/index-modules.html

DEFAULT VALUE

{{ TAMR_TOTAL_CORES }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

HOST_IP

Set to 127.0.0.1 (loopback), internal IP, or external IP. To determine the setting that is correct for your infrastructure, Tamr recommends testing each option in a sandbox environment.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • formula: com.tamr.procurify.admin.commands.config.formulas.HostIp
  • secure: False

JAVA_HOME

DEFAULT VALUE

{{ TAMR_JAVA_HOME }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ADLS_AUTH_TOKEN_URL

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True
  • description: None

TAMR_ADLS_CLIENT_ID

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True
  • description: None

TAMR_ADLS_CLIENT_KEY

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True
  • description: None

TAMR_ADLS_FQDN

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True
  • description: None

TAMR_ADLS_GEN2_ACCOUNT_NAME

Name of the Azure storage account containing the ADLS Gen2 instance. For details, see: https://docs.microsoft.com/en-us/azure/storage/common/storage-account-overview

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_ADLS_GEN2_CONTAINER_NAME

Name of the Data Lake Storage container in the storage account used for ADLS Gen2.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_ADLS_GEN2_CLIENT_ID

Client ID of the service principal used to connect to ADLS Gen2. For details, see: https://docs.microsoft.com/en-us/azure/active-directory/develop/howto-create-service-principal-portal#get-tenant-and-app-id-values-for-signing-in

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True

TAMR_ADLS_GEN2_CLIENT_SECRET

Secret of the service principal used to connect to ADLS Gen2. For details, see: https://docs.microsoft.com/en-us/azure/active-directory/develop/howto-create-service-principal-portal#option-2-create-a-new-application-secret

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True

TAMR_ADLS_GEN2_TENANT_ID

Tenant ID of the Azure service principal used to connect to ADLS Gen2. For details, see: https://docs.microsoft.com/en-us/azure/active-directory/develop/howto-create-service-principal-portal#get-tenant-and-app-id-values-for-signing-in

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True

TAMR_API_FACADE_MAX_CONCURRENT_REQUESTS

DEFAULT VALUE

40

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_AUTH_ADDITIONAL_CREDENTIAL_FACTORIES

A list of JSON dictionaries separated by |||, where each dictionary describes an LDAP domain.
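
As a minimal sketch of the |||-separated format, the following Python code joins JSON dictionaries into a single value and splits the value back into dictionaries. The dictionary keys shown here (host and port) are hypothetical placeholders; this reference does not list the keys that Tamr expects.

```python
import json
from typing import Dict, List

SEPARATOR = "|||"


def join_credential_factories(domains: List[Dict[str, object]]) -> str:
    """Serialize each LDAP domain to JSON and join the results with |||."""
    return SEPARATOR.join(json.dumps(domain) for domain in domains)


def split_credential_factories(value: str) -> List[Dict[str, object]]:
    """Split a |||-separated value back into one dictionary per LDAP domain."""
    return [json.loads(part) for part in value.split(SEPARATOR)]


# Hypothetical keys, for illustration only.
domains = [
    {"host": "ldap1.example.com", "port": 389},
    {"host": "ldap2.example.com", "port": 636},
]
value = join_credential_factories(domains)
print(value)
```

This sketch assumes that no JSON value itself contains the ||| sequence.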

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_AUTH_ADMIN_BIND_PORT

DEFAULT VALUE

9021

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_AUTH_ADMIN_PORT

DEFAULT VALUE

9021

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_AUTH_BIND_PORT

DEFAULT VALUE

9020

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_AUTH_HOSTNAME

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_AUTH_LDAP_ADMIN_DN

The LDAP distinguished name (DN) of the user account Tamr should use to connect to LDAP. This value must be a full distinguished name.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_AUTH_LDAP_ADMIN_PASSWORD

The password of the user account Tamr should use to connect to LDAP.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True

TAMR_AUTH_LDAP_GROUP_BASEDN

The base distinguished name (DN) for the groups Tamr should query.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_AUTH_LDAP_GROUP_FILTER

The filter expression Tamr should apply when querying groups.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_AUTH_LDAP_GROUP_IDATTR

The name of the unique identifier attribute of the LDAP groups Tamr should use.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_AUTH_LDAP_HOST

The fully-qualified hostname of the LDAP server.

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_AUTH_LDAP_PORT

The port number of the LDAP server.

DEFAULT VALUE

389

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_AUTH_LDAP_SECURE

Specifies whether to enable Tamr to connect over LDAPS.

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_AUTH_LDAP_USER_BASEDN

The base distinguished name (DN) for the users Tamr should query.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_AUTH_LDAP_USER_DNPATTERN

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_AUTH_LDAP_USER_EMAILATTR

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_AUTH_LDAP_USER_FILTER

The filter expression Tamr should apply when querying for a list of users.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_AUTH_LDAP_USER_FINDER

Stores a conditional expression that uses a user's LDAP attributes, in addition to their credentials, to restrict whether they can authenticate.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_AUTH_LDAP_USER_GIVENNAMEATTR

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_AUTH_LDAP_USER_IDATTR

The name of the LDAP attribute that Tamr matches a username against.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_AUTH_LDAP_USER_MEMBEROFATTR

The name of the LDAP attribute containing the group membership of a user.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_AUTH_LDAP_USER_SURNAMEATTR

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_AUTH_LOG_LEVEL

DEFAULT VALUE

INFO

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_AUTH_PORT

DEFAULT VALUE

9020

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_BACKUP_CLOUD_SQL_ENABLED

Whether to use Cloud SQL's native backup API to back up PostgreSQL when using Cloud SQL.

DEFAULT VALUE

True

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_BACKUP_FS_CONFIG_DIR

DEFAULT VALUE

{{ TAMR_FS_CONFIG_DIR }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_BACKUP_FS_CONFIG_URIS

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_BACKUP_FS_EXTRA_CONFIG

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True
  • description: None

TAMR_BACKUP_FS_EXTRA_URIS

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_BACKUP_FS_KERBEROS_ENABLED

If set to true, also configure TAMR_KERBEROS_KEYTAB, TAMR_KERBEROS_PRINCIPAL, and TAMR_KERBEROS_KRB5.

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify', 'supporting']
  • secure: False

TAMR_FILE_BASED_HBASE_BACKUP_ENABLED

Whether to back up the contents of the HBase root directory to the backup path.

DEFAULT VALUE

False

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_BACKUP_AWS_CLI_ENABLED

Whether to use the AWS S3 command line utility when backing up to or restoring from S3. The AWS S3 CLI is used only when this variable is true and the CLI is installed locally.

DEFAULT VALUE

False

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_BACKUP_S3DISTCP_ENABLED

Whether to run s3distcp on the deployment's static EMR cluster (see TAMR_BACKUP_EMR_CLUSTER_ID) when backing up to or restoring from S3.

DEFAULT VALUE

False

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_BACKUP_EMR_CLUSTER_ID

ID of the static EMR cluster to run s3distcp on when backing up to or restoring from S3.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_BACKUP_GSUTIL_ENABLED

Whether to use the GCS command line utility gsutil when backing up to or restoring from GCS. gsutil is used only when this variable is true and gsutil is installed locally.

DEFAULT VALUE

True

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_BACKUP_GSUTIL_EXTRA_ARGS

Extra command line options to provide to gsutil when using it to back up or restore files.

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_BIGTABLE_BACKUP_NATIVE_ENABLED

When enabled while using Bigtable, backups will use Bigtable's native backup API rather than exporting to archive files. The native API is much faster than export, but native backups are less portable because they can only be restored into the same instance of Bigtable where they were created. Default is true (native backups enabled).

DEFAULT VALUE

True

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_BIGTABLE_BACKUP_NATIVE_TTL

When Bigtable native backup is enabled (see TAMR_BIGTABLE_BACKUP_NATIVE_ENABLED), this configures the expiration time, in days, of each backup relative to its creation (its Time To Live). The minimum allowed is 1 day, and the maximum is 30 days. Default is 14 days.

DEFAULT VALUE

14

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_BIGTABLE_EXTRA_CONFIG

DEFAULT VALUE

{}

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_BIGTABLE_CLUSTER_ID

Cluster ID of the only cluster in the Bigtable instance. Tamr assumes that only one cluster is present in the instance. Required when Bigtable dynamic scaling is enabled.

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_BIGTABLE_DOWNSCALE_DELAY

Time in minutes to wait after the job queue empties and all executing jobs are complete before downscaling the Bigtable cluster. Only relevant if dynamic scaling is enabled. Default is 1 (i.e., one minute).

DEFAULT VALUE

1

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_BIGTABLE_INSTANCE_ID

The Bigtable instance ID. Required to use Bigtable.

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_BIGTABLE_MAX_NODES

The maximum number of nodes to upscale the cluster to. Must be set to a value greater than TAMR_BIGTABLE_MIN_NODES. Only relevant if dynamic scaling is enabled. Default is 0 (must be overridden).

DEFAULT VALUE

0

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_BIGTABLE_MIN_NODES

The minimum number of nodes to downscale the cluster to. Must be positive. It is up to the user to ensure this value is big enough to fit all data stored in the cluster. If this value is set too small, then the system may be unable to downscale. Only relevant if dynamic scaling is enabled. Default is 1.

DEFAULT VALUE

1

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_BIGTABLE_PROJECT_ID

GCP Project ID of the Bigtable instance. Required to use Bigtable.

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_BIGTABLE_SCALING_BUFFER_SECONDS

Minimum number of seconds to wait before scaling down the Bigtable cluster again. Scaling down too quickly can cause performance problems, such as increased latency, if the remaining nodes in the cluster become temporarily overwhelmed. When you decrease the number of nodes in a cluster, try not to reduce the cluster size by more than 10% in a 10-minute period.

DEFAULT VALUE

600

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_BIGTABLE_SCALING_ENABLED

Whether to enable dynamic cluster size scaling. Default is false.

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_BIGTABLE_SCALING_FACTOR

Fraction of the cluster to spin down at a time. Valid range is (0, 1]. Default is 0.1 (10%).

DEFAULT VALUE

0.1

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
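
The constraints stated for the Bigtable variables (a positive TAMR_BIGTABLE_MIN_NODES, a TAMR_BIGTABLE_MAX_NODES greater than the minimum, a TAMR_BIGTABLE_SCALING_FACTOR in the range (0, 1], and a TAMR_BIGTABLE_BACKUP_NATIVE_TTL between 1 and 30 days) can be checked before applying a configuration. The following Python sketch shows one way to do so. The check_bigtable_scaling function is hypothetical and is not part of the Tamr software.

```python
from typing import List


def check_bigtable_scaling(
    min_nodes: int,
    max_nodes: int,
    scaling_factor: float,
    backup_ttl_days: int,
) -> List[str]:
    """Return a list of problems found; an empty list means all checks pass."""
    problems = []
    if min_nodes < 1:
        problems.append("TAMR_BIGTABLE_MIN_NODES must be positive")
    if max_nodes <= min_nodes:
        problems.append(
            "TAMR_BIGTABLE_MAX_NODES must be greater than TAMR_BIGTABLE_MIN_NODES"
        )
    if not 0 < scaling_factor <= 1:
        problems.append("TAMR_BIGTABLE_SCALING_FACTOR must be in the range (0, 1]")
    if not 1 <= backup_ttl_days <= 30:
        problems.append("TAMR_BIGTABLE_BACKUP_NATIVE_TTL must be between 1 and 30")
    return problems


# The documented defaults: MAX_NODES defaults to 0 and must be overridden.
print(check_bigtable_scaling(1, 0, 0.1, 14))
```

With the documented defaults, the only reported problem is the TAMR_BIGTABLE_MAX_NODES value, which must be overridden when dynamic scaling is enabled.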

TAMR_CATEGORIZATION_FEATURE_SCALING

DEFAULT VALUE

true

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_CATEGORIZATION_GRADIENT_DESCENT_ITERATIONS

DEFAULT VALUE

10

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_CATEGORIZATION_MAX_MATRIX_SIZE_IN_MEM

DEFAULT VALUE

40000000

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_CATEGORIZATION_REGULARIZATION_PARAMETER

DEFAULT VALUE

0.1

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_CATEGORIZATION_STRENGTH_THRESHOLD_HIGH

With TAMR_CATEGORIZATION_STRENGTH_THRESHOLD_MEDIUM, used to define the level of confidence for a Tamr suggestion in a categorization project as High, Medium, or Low. H, M, and L icons indicate the confidence level for each suggestion, and users can filter by confidence level. The value must be between 0 and 1.

DEFAULT VALUE

0.7

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CATEGORIZATION_STRENGTH_THRESHOLD_MEDIUM

With TAMR_CATEGORIZATION_STRENGTH_THRESHOLD_HIGH, used to define the level of confidence for a Tamr suggestion in a categorization project as High, Medium, or Low. H, M, and L icons indicate the confidence level for each suggestion, and users can filter by confidence level. Suggestions with a level below the value set for this variable are reported as low strength. The value must be between 0 and 1.

DEFAULT VALUE

0.4

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
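
The two categorization thresholds divide suggestion strength into three bands. The following Python sketch illustrates this with the default values of 0.7 and 0.4. Suggestions below the medium threshold are reported as Low, as stated above. How a value exactly equal to a threshold is treated is an assumption of this sketch, and the confidence_level function is a hypothetical name.

```python
def confidence_level(score: float, high: float = 0.7, medium: float = 0.4) -> str:
    """Map a score between 0 and 1 to High, Medium, or Low.

    Assumption: a score equal to a threshold falls into the higher band.
    """
    if score < medium:
        return "Low"
    if score < high:
        return "Medium"
    return "High"


for score in (0.3, 0.5, 0.9):
    print(score, confidence_level(score))
```

Other threshold pairs can be passed explicitly, for example confidence_level(0.75, high=0.85, medium=0.7).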

TAMR_CLUSTER_CONFIDENCE_THRESHOLD_HIGH

With TAMR_CLUSTER_CONFIDENCE_THRESHOLD_MEDIUM, used to define the level of confidence for a Tamr suggestion in a mastering project as High, Medium, or Low. H, M, and L icons indicate the confidence level for each suggestion, and users can filter by confidence level. The value must be between 0 and 1.

DEFAULT VALUE

0.85

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CLUSTER_CONFIDENCE_THRESHOLD_MEDIUM

With TAMR_CLUSTER_CONFIDENCE_THRESHOLD_HIGH, used to define the level of confidence for a Tamr suggestion in a mastering project as High, Medium, or Low. H, M, and L icons indicate the confidence level for each suggestion, and users can filter by confidence level. Suggestions with a level below the value set for this variable are reported as low strength. The value must be between 0 and 1.

DEFAULT VALUE

0.7

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CONGLOMERATE_MEMORY

DEFAULT VALUE

4096m

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_CONNECTION_INFO_TYPE

The type of HBase client connection that Tamr uses. Set to:

  • "hbase" to specify the ZooKeeper quorum for connection and configuration.
  • "hbase-site" to use an XML file, such as hbase-site.xml, for connection and configuration.
  • "bigtable" to use Bigtable instead of HBase for the connection.

DEFAULT VALUE

hbase-site

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_CONSOLE_LOGGING_FILTER

DEFAULT VALUE

DENY

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_CORS_ALLOWED_HEADERS

Used to manage cross-origin resource sharing (CORS).

DEFAULT VALUE

Content-Type,Authorization,X-Requested-With,Content-Length,Accept,Origin

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CORS_ALLOWED_METHODS

Used to manage cross-origin resource sharing (CORS).

DEFAULT VALUE

GET,OPTIONS,POST,PUT,HEAD,DELETE,PATCH

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CORS_ALLOW_CREDENTIALS

Used to manage cross-origin resource sharing (CORS).

DEFAULT VALUE

true

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CORS_AUTH_ALLOWED_ORIGINS

Accepts one or more URLs separated by commas to identify external resources for cross-origin resource sharing (CORS) management. For example, http://origin1,http://origin2.
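
As a minimal sketch, the following Python code builds a comma-separated origins value from a list of URLs. The check that each entry has an http or https scheme and a host is an assumption of this sketch, not a documented Tamr requirement, and the build_allowed_origins function is hypothetical.

```python
from typing import List
from urllib.parse import urlsplit


def build_allowed_origins(origins: List[str]) -> str:
    """Join origin URLs with commas after a basic sanity check on each one."""
    for origin in origins:
        parts = urlsplit(origin)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an http(s) origin: {origin!r}")
    return ",".join(origins)


print(build_allowed_origins(["http://origin1", "http://origin2"]))
```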

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CORS_DATASET_ALLOWED_ORIGINS

Used to manage cross-origin resource sharing (CORS).

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CORS_DEDUP_ALLOWED_ORIGINS

Used to manage cross-origin resource sharing (CORS).

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CORS_EXPOSED_HEADERS

Used to manage cross-origin resource sharing (CORS).

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CORS_MATCH_ALLOWED_ORIGINS

Used to manage cross-origin resource sharing (CORS).

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CORS_PERSISTENCE_ALLOWED_ORIGINS

Used to manage cross-origin resource sharing (CORS).

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CORS_PREFLIGHT_MAX_AGE

Used to manage cross-origin resource sharing (CORS).

DEFAULT VALUE

1800

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CORS_PREVIEW_ALLOWED_ORIGINS

Used to manage cross-origin resource sharing (CORS).

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CORS_PUBAPI_ALLOWED_ORIGINS

Used to manage cross-origin resource sharing (CORS).

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CORS_RECIPE_ALLOWED_ORIGINS

Used to manage cross-origin resource sharing (CORS).

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CORS_TAXONOMY_ALLOWED_ORIGINS

Used to manage cross-origin resource sharing (CORS).

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CORS_TRANSFORM_ALLOWED_ORIGINS

Used to manage cross-origin resource sharing (CORS).

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_CORS_UNIFY_ALLOWED_ORIGINS

Used to manage cross-origin resource sharing (CORS).

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_ADMIN_BIND_PORT

DEFAULT VALUE

9151

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_DATASET_ADMIN_PORT

DEFAULT VALUE

9151

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_DATASET_BIND_PORT

DEFAULT VALUE

9150

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_DATASET_EMR_ACCESS_SECURITY_GROUP

Security group for internal cluster communication.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_CORE_VOLUME_COUNT

Number of EBS volumes for each core instance.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_CORE_VOLUME_SIZE

The volume size, in gibibytes (GiB). This can be a number from 1 to 1024.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_CORE_VOLUME_TYPE

The volume type. Supported volume types are gp2, io1, and standard.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_CLUSTER_TAGS

Comma-separated list of tags (key-value pairs) for the cluster. Example: key1,value1,key2,value2
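
The flat key1,value1,key2,value2 format can be produced from a mapping of tag names to tag values. The following Python sketch shows one way to do so. Rejecting commas inside keys and values is an assumption of this sketch, because this reference does not describe any escaping, and the format_cluster_tags function is hypothetical.

```python
from typing import Dict


def format_cluster_tags(tags: Dict[str, str]) -> str:
    """Flatten a tag mapping into the key1,value1,key2,value2 format."""
    parts = []
    for key, value in tags.items():
        if "," in key or "," in value:
            raise ValueError("tag keys and values must not contain commas")
        parts.extend([key, value])
    return ",".join(parts)


print(format_cluster_tags({"key1": "value1", "key2": "value2"}))
```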

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_CLUSTER_NAME_PREFIX

Prefix for the ephemeral EMR cluster name, in the format <prefix>-.

DEFAULT VALUE

tamr-emr-

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_CUSTOM_AMI_ID

The ID of a custom Amazon EBS-backed Linux AMI. If specified, Amazon EMR uses this AMI when it launches cluster EC2 instances.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_INSTANCE_COUNT

Number of instances for ephemeral EMR cluster.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_INSTANCE_PROFILE

Job flow role for ephemeral EMR cluster.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_INSTANCE_TYPE

Type of instance (master/slave) for ephemeral EMR cluster.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_KEY_NAME

EC2 key name for ephemeral EMR cluster.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_LOG_URI

Log URI for ephemeral EMR cluster.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_MASTER_INSTANCE_TYPE

The EC2 instance type for the master instance.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_MASTER_SECURITY_GROUP

Primary security group for the master instances.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_MASTER_SECURITY_GROUP_ADDITIONAL

Comma-separated list of additional security groups to add to the master instances.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_MASTER_VOLUME_COUNT

Number of EBS volumes for the master instance.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_MASTER_VOLUME_SIZE

The volume size, in gibibytes (GiB). This can be a number from 1 to 1024.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_MASTER_VOLUME_TYPE

The volume type. Supported volume types are gp2, io1, and standard.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_RELEASE

EMR release label for ephemeral EMR cluster.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_ROOT_VOLUME_SIZE

The size, in GiB, of the EBS root device volume of the Linux AMI that is used for each EC2 instance. Available in Amazon EMR version 4.x and later.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_RUN_JOB_FLOW_REQUEST

Serialized RunJobFlowRequest object. The formula for this variable uses many of the other configuration values that begin with TAMR_DATASET_EMR_ to create this serialized object. Setting this variable directly bypasses the formula and all of those configuration values.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrEmrRunJobFlowRequest

TAMR_DATASET_EMR_SERVICE_ROLE

Service role for ephemeral EMR cluster.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_SUBNET_ID

EC2 subnet ID for ephemeral EMR cluster.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_WORKER_SECURITY_GROUP

Primary security group for the worker instances.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_EMR_WORKER_SECURITY_GROUP_ADDITIONAL

Comma-separated list of additional security groups to add to the worker instances.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_GC_POLLER_DELAY_BETWEEN_RUNS_SECONDS

DEFAULT VALUE

300

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_DEFER_INITIALIZATION

Defers initialization of external services until they are used. Currently applies only to Spark.

DEFAULT VALUE

False

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_EXTERNAL_DATASET_POLLER_DELAY_BETWEEN_RUNS_SECONDS

The interval (in seconds) between attempts to check external datasets for updates.

DEFAULT VALUE

10

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_HOSTNAME

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_DATASET_LOG_LEVEL

DEFAULT VALUE

INFO

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_DATASET_NUMBER_OF_VERSIONS_TO_KEEP

Together with TAMR_DATASET_VERSION_TIME_TO_LIVE_IN_MINUTES, specifies the garbage collection policy that is applied to all datasets, with the exception of published clustering datasets. Defaults to 5 versions.

DEFAULT VALUE

5

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DATASET_PORT

DEFAULT VALUE

9150

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_DATASET_USE_COPY_EVENT_LOGGER

Use a version of the Spark event logger that closes the file stream between writes. Required when using certain filesystems, such as S3 or DBFS, that do not make the contents of a file available for reading until the stream is closed.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrUseCopyEventLogger
  • secure: False

TAMR_DATASET_VERSION_TIME_TO_LIVE_IN_MINUTES

Together with TAMR_DATASET_NUMBER_OF_VERSIONS_TO_KEEP, specifies the garbage collection policy that is applied to all datasets, with the exception of published clustering datasets. Defaults to 0 minutes.

DEFAULT VALUE

0

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_HUMAN_DATASET_NUMBER_OF_VERSIONS_TO_KEEP

Specifies the number of versions to keep for human datasets. Defaults to 1000 versions.

DEFAULT VALUE

1000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DEDUP_ADMIN_BIND_PORT

DEFAULT VALUE

9141

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_DEDUP_ADMIN_PORT

DEFAULT VALUE

9141

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_DEDUP_BIND_PORT

DEFAULT VALUE

9140

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_MAX_ROWS_PER_PARTITION

DEFAULT VALUE

100000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_DEDUP_DISABLE_INCREMENTAL

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_DEDUP_HOSTNAME

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_DEDUP_LOG_LEVEL

DEFAULT VALUE

INFO

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_DEDUP_NUM_QUESTIONS

DEFAULT VALUE

50

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_DEDUP_PORT

DEFAULT VALUE

9140

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_DEDUP_VIEW_STORAGE

The storage type for the UI data in the dedup service. Possible values are ELASTICSEARCH and BIGQUERY.

DEFAULT VALUE

ELASTICSEARCH

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_DEFAULT_SESSION_TIMEOUT

DEFAULT VALUE

86400

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_ELASTIC_EXPORTER_LOG_DIR

DEFAULT VALUE

{{ TAMR_LOG_DIR }}/elastic_exporter.d

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ELASTIC_EXPORTER_PORT

DEFAULT VALUE

9130

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ELK_ENABLED

Set to 'true' to use the Kibana service for Tamr logs, and also enable an Elasticsearch instance for logging.

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False

TAMR_EMRFS_DYNAMO_TABLE

Name of the DynamoDB table for EMRFS.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False

TAMR_EMRFS_HOME

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/emrfs

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_EMR_PROPERTIES

EMR configuration properties to use when creating a new EMR cluster. Properties are organized into different classifications, which map roughly to Hadoop site files. For more information, see https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html. Note that when you override the default value, the configuration variable TAMR_EMRFS_DYNAMO_TABLE is ignored, and fs.s3.consistent.metadata.tableName must be set explicitly if you use consistent view.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrEmrProperties
  • secure: False

TAMR_ENRICHMENT_TENANT_ID

The tenant ID used to identify the account using Enrichment Services.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_ENRICHMENT_URL

The URL used to access the Enrichment Management Service.

DEFAULT VALUE

https://enrich.datamastering.site/v1/enrich-mgr/

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_ENRICHMENT_USE_INTERNAL_AUTH

True if internal authorization should be used for enrichment.

DEFAULT VALUE

False

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_ENRICHMENT_CONNECT_TIMEOUT_MS

Timeout in milliseconds for connecting to enrichment SaaS endpoints.

DEFAULT VALUE

5000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_ENRICHMENT_READ_TIMEOUT_MS

Timeout in milliseconds for reading from enrichment SaaS endpoints.

DEFAULT VALUE

10000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_ES_APIHOST

The hostname and port of the REST API endpoint of the Elasticsearch cluster to use.

DEFAULT VALUE

localhost:9200

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_ES_BATCH_CONCURRENCY

The maximum number of Bulk API requests to have open at any one time. Default is 1 (i.e., no concurrency).

DEFAULT VALUE

1

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_ES_BATCH_ERROR_BUDGET

The maximum tolerable fraction, between zero and one, of failed writes within a large batch. When writing a large volume of documents to Elasticsearch, it may be acceptable for some fraction of documents to get lost during indexing and not be searchable. When this is the case, configuring an error budget allows the system to expend less effort ensuring every last document is indexed and overall throughput may improve. Default is 0.0 (no errors allowed).

DEFAULT VALUE

0.0

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
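
As a rough illustration of how an error budget can be read, the sketch below checks whether the failed fraction of a batch stays within the configured budget. The function name and the way failures are counted are assumptions made for this example; the source does not describe how Tamr evaluates the budget internally.

```python
def within_error_budget(failed: int, total: int, budget: float) -> bool:
    """Return True if the failed fraction of writes does not exceed the budget.

    Hypothetical helper: `budget` plays the role of TAMR_ES_BATCH_ERROR_BUDGET.
    """
    if not 0.0 <= budget <= 1.0:
        raise ValueError("budget must be a fraction between zero and one")
    if total == 0:
        return True  # nothing was written, so nothing failed
    return failed / total <= budget


# With the default budget of 0.0, a single failed document exceeds the budget.
print(within_error_budget(failed=1, total=1000, budget=0.0))
# With a budget of 0.01, up to 1% of the documents may fail.
print(within_error_budget(failed=5, total=1000, budget=0.01))
```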

TAMR_ES_BATCH_RETRY_COUNT

How many times to retry a Bulk API request when Elasticsearch rejects documents because it is overloaded. Default is 10.

DEFAULT VALUE

10

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_ES_BATCH_RETRY_WAIT

The base backoff time, in milliseconds, before retrying a Bulk API request when Elasticsearch is overloaded. If a request is retried more than once, subsequent backoff times are computed via binary exponential backoff. Default is 30ms. See https://en.wikipedia.org/wiki/Exponential_backoff

DEFAULT VALUE

30

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
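
To make the interaction between TAMR_ES_BATCH_RETRY_WAIT and TAMR_ES_BATCH_RETRY_COUNT concrete, the following sketch lists the waits that result from doubling the base backoff on every retry. Deterministic doubling is an assumption here; an actual binary exponential backoff implementation may pick a random wait up to each of these values, and the source does not say which variant Tamr uses.

```python
from typing import List


def backoff_schedule(base_wait_ms: int, retry_count: int) -> List[int]:
    """Return the wait (in ms) before each retry, doubling after every attempt."""
    return [base_wait_ms * (2 ** attempt) for attempt in range(retry_count)]


# Defaults from this reference: 30 ms base wait, 10 retries.
schedule = backoff_schedule(base_wait_ms=30, retry_count=10)
print(schedule[:4])   # [30, 60, 120, 240]
print(schedule[-1])   # 15360
print(sum(schedule))  # 30690, the worst-case total wait in milliseconds
```

Under these assumptions, the defaults bound the total wait for one Bulk API request at roughly 31 seconds.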

TAMR_ES_BATCH_SIZE

The number of documents per Bulk API request sent. Default is 1000 documents per batch.

DEFAULT VALUE

1000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_ES_ENABLED

Whether Tamr will index user data in Elasticsearch or not. Elasticsearch is used to power Tamr's interactive data UI, so when this is set to false Tamr will run "headless," that is, without its core UI capabilities. It can be useful to disable Elasticsearch in production settings where the models are trained on a separate instance and the goal is to maximize pipeline throughput.

DEFAULT VALUE

true

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_ES_MAX_GEOSPATIAL_FEATURES_DEFAULT

The result limit for fetching records with geospatial features to render the geospatial map. This value must not exceed the value of TAMR_ES_MAX_RESULT_WINDOW.

DEFAULT VALUE

1000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
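
The constraint between this variable and TAMR_ES_MAX_RESULT_WINDOW can be checked before applying a configuration change. The helper below is a hypothetical pre-flight check written for this example, not part of Tamr's tooling.

```python
def validate_geospatial_limit(max_geospatial_features: int, max_result_window: int) -> int:
    """Return the geospatial limit if it fits within the result window, else raise."""
    if max_geospatial_features > max_result_window:
        raise ValueError(
            "TAMR_ES_MAX_GEOSPATIAL_FEATURES_DEFAULT must not exceed "
            "TAMR_ES_MAX_RESULT_WINDOW"
        )
    return max_geospatial_features


# Defaults from this reference: 1000 features, result window of 10000.
print(validate_geospatial_limit(1000, 10000))
```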

TAMR_ES_HEALTH_CHECK_METADATA

Whether Tamr should perform script type related health checks when starting Tamr. Some remote ES services do not provide metadata about the type of allowed scripts, which means health checks fail and Tamr will not start. Set this to "false" when you know that this is the case for a remote ES deployment. Tamr supports inline scripts only.

DEFAULT VALUE

True

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_ES_HTTP_TIMEOUT

DEFAULT VALUE

0

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_ES_HTTP_MAX_LINE_LENGTH

Maximum size of an HTTP URL

DEFAULT VALUE

100KB

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False

TAMR_ES_INDEX_DYNAMIC_SETTINGS

Settings to apply to Tamr's indexes in Elasticsearch. These settings are re-applied every time Tamr starts, and whenever the create index API is called. The value of this field is a YAML dictionary where keys are the Elasticsearch setting names, and values are the respective setting values. The default value is:

  index.number_of_replicas: {{ TAMR_ES_INDEX_NUMBER_OF_REPLICAS }}
  index.mapping.total_fields.limit: {{ TAMR_ES_INDEX_TOTAL_FIELDS_LIMIT }}
  index.max_result_window: {{ TAMR_ES_MAX_RESULT_WINDOW }}
  index.refresh_interval: 300s
  index.merge.scheduler.max_thread_count: {{ TAMR_TOTAL_CORES * 2 }}
  index.translog.durability: async

Note that if the default value is overridden, then TAMR_ES_INDEX_NUMBER_OF_REPLICAS, TAMR_ES_INDEX_TOTAL_FIELDS_LIMIT, and TAMR_ES_MAX_RESULT_WINDOW are ignored.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrEsIndexDynamicSettings
  • secure: False
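
The default dictionary for TAMR_ES_INDEX_DYNAMIC_SETTINGS is assembled from other configuration variables. The sketch below mimics that assembly in Python to show which inputs feed which setting; the function is hypothetical (the real default comes from the formula class listed above), and the core count of 8 is an assumed example value.

```python
from typing import Dict, Union


def default_dynamic_settings(
    number_of_replicas: int,
    total_fields_limit: int,
    max_result_window: int,
    total_cores: int,
) -> Dict[str, Union[int, str]]:
    """Build the documented default settings dictionary from its inputs."""
    return {
        "index.number_of_replicas": number_of_replicas,
        "index.mapping.total_fields.limit": total_fields_limit,
        "index.max_result_window": max_result_window,
        "index.refresh_interval": "300s",
        "index.merge.scheduler.max_thread_count": total_cores * 2,
        "index.translog.durability": "async",
    }


# Documented defaults for the three Tamr variables; 8 cores is an assumption.
settings = default_dynamic_settings(
    number_of_replicas=0,
    total_fields_limit=1000000,
    max_result_window=10000,
    total_cores=8,
)
for key, value in settings.items():
    print(f"{key}: {value}")
```

Because an override replaces the whole dictionary, a custom value needs to restate every setting it wants to keep, including the three that would otherwise come from the individual variables.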

TAMR_ES_INDEX_NUMBER_OF_REPLICAS

DEFAULT VALUE

0

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_ES_INDEX_NUMBER_OF_SHARDS

The number of shards to set when creating the Tamr index in Elasticsearch. Default value is the number of cores on the local host machine, so this should be overridden when using a remote Elasticsearch cluster. Note: this value is only applied when the index is created.

DEFAULT VALUE

{{ ES_NUM_SHARDS }}

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_ES_INDEX_STATIC_SETTINGS

Settings to apply when creating new indexes in Elasticsearch. Use this for settings that can only be applied when an index is being created. The value of this field is a YAML dictionary where keys are the Elasticsearch setting names, and values are the respective setting values. The default value sets the number of shards to TAMR_ES_INDEX_NUMBER_OF_SHARDS. Note that if the default value is overridden, then TAMR_ES_INDEX_NUMBER_OF_SHARDS is ignored.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrEsIndexStaticSettings
  • secure: False

TAMR_ES_INDEX_TOTAL_FIELDS_LIMIT

DEFAULT VALUE

1000000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_ES_JAVA_OPTS

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • formula: com.tamr.procurify.admin.commands.config.formulas.KerberosJavaArgs
  • secure: False
  • description: None

TAMR_ES_LOGGING_APIHOST

DEFAULT VALUE

localhost:{{ TAMR_ES_LOGGING_PORT }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ES_LOGGING_CLUSTER_NAME

DEFAULT VALUE

tamr-logging

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ES_LOGGING_EXPORTER_LOG_DIR

DEFAULT VALUE

{{ TAMR_LOG_DIR }}/elastic_exporter-logging.d

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

TAMR_ES_LOGGING_EXPORTER_PORT

DEFAULT VALUE

9135

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ES_LOGGING_EXPORTER_TARGET

DEFAULT VALUE

{{ TAMR_PROMETHEUS_TARGETS_DIR }}/standard/elasticsearch-logging.yml

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ES_LOGGING_EXPORTER_TARGET_TEMPLATE

DEFAULT VALUE

{{ TAMR_PROMETHEUS_TARGETS_DIR }}/standard/elasticsearch-logging.yml.template

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ES_LOGGING_HOME

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/elasticsearch-5.6.3

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ES_LOGGING_JVM_OPTIONS_TARGET

DEFAULT VALUE

{{ TAMR_ES_LOGGING_HOME }}/config/jvm.options

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ES_LOGGING_JVM_OPTIONS_TEMPLATE

DEFAULT VALUE

{{ TAMR_ES_LOGGING_HOME }}/config/jvm.options.template

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ES_LOGGING_MEMORY

DEFAULT VALUE

1G

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ES_LOGGING_PORT

DEFAULT VALUE

9250

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ES_LOGGING_TRANSPORT_PORT

DEFAULT VALUE

9350

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ES_MAX_CLAUSE_COUNT

DEFAULT VALUE

4096

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ES_MAX_RESULT_WINDOW

DEFAULT VALUE

10000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_ES_MEM_INDEX_BUFFER_SIZE

DEFAULT VALUE

25%

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ES_NETWORK_HOST

Network host that Elasticsearch binds to. Can be a specific IP address or a special value such as "_local_", "_site_", or "_global_". "_local_" is recommended for single-node installations, but binding to a non-loopback interface is required when Spark is external. See https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html#network-interface-values

DEFAULT VALUE

_local_

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

TAMR_ES_PASSWORD

Password to use to authenticate to Elasticsearch, using basic authentication. Not required unless the Elasticsearch cluster you're using has security and authentication enabled. The value passed in may be encrypted.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: True

TAMR_ES_SOCKET_TIMEOUT

Defines the socket timeout for Elasticsearch clients, in milliseconds. This is the timeout for waiting for data or, put differently, a maximum period of inactivity between two consecutive data packets. A timeout value of zero is interpreted as an infinite timeout. A negative value is interpreted as undefined (system default). The default value is 900000, i.e., fifteen minutes.

DEFAULT VALUE

900000

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
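
The three cases described for TAMR_ES_SOCKET_TIMEOUT (positive, zero, negative) can be summarized in a small helper. This is an illustrative sketch only; the function name is made up for this example.

```python
def describe_socket_timeout(timeout_ms: int) -> str:
    """Translate a socket timeout value into the behavior it selects."""
    if timeout_ms < 0:
        return "system default"
    if timeout_ms == 0:
        return "infinite"
    return f"{timeout_ms / 60000:g} minutes"


print(describe_socket_timeout(900000))  # 15 minutes
print(describe_socket_timeout(0))       # infinite
print(describe_socket_timeout(-1))      # system default
```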

TAMR_ES_SPARK_CONNECTION_CONFIG

Sets a configuration for the Elasticsearch client used in Spark that differs from the one used in service code. In particular, allows customizing the socket timeout, connection pool size, and keep-alive for Spark. Spark primarily uses the Bulk API to index large datasets and achieves the best throughput with a relatively low socket timeout. In the service tier, however, a larger timeout is desirable in case an Elasticsearch query runs long. Hence, the default timeout for Spark is 30s versus 900s for the UI. The value should be a YAML dictionary following the schema of tamr.elasticsearch.v1.ConnectionConfig defined in the file tamr.elasticsearch.proto.

DEFAULT VALUE

socketTimeoutMillis: 30000
maxConnectionsTotal: 64
maxConnectionsPerRoute: 32
keepAliveMillis: 0

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_ES_SSL_ENABLED

Whether to connect to Elasticsearch over https or not. Default is false (http).

DEFAULT VALUE

False

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_ES_SSL_VALIDATE_CERTS

When using https to connect to Elasticsearch (i.e., TAMR_ES_SSL_ENABLED=true), defines whether to validate server certificates. Uses the JRE default trust material when enabled.

DEFAULT VALUE

False

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_ES_USER

Username to use to authenticate to Elasticsearch. Not required unless the Elasticsearch cluster you're using has security and authentication enabled.

DEFAULT VALUE

(empty string)

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_FILE_LOGGING_FILTER

DEFAULT VALUE

NEUTRAL

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_FS_CONFIG_DIR

Identifies the directory that stores HDFS configuration files specified by TAMR_FS_CONFIG_URIS and TAMR_FS_EXTRA_URIS.

DEFAULT VALUE

{{ TAMR_HADOOP_HOME }}/etc/hadoop/

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_FS_CONFIG_URIS

Stores a semicolon-separated list of the URIs of the core Hadoop configuration files.

DEFAULT VALUE

{{ TAMR_HADOOP_HOME }}/etc/hadoop/core-site.xml;{{ TAMR_HADOOP_HOME }}/etc/hadoop/yarn-site.xml

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
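
Since TAMR_FS_CONFIG_URIS is a semicolon-separated list that can contain template references, a quick way to inspect it is to expand the template and split on semicolons. The sketch below does this with plain string operations; the helper name and the /opt/hadoop location are hypothetical, and Tamr's own template resolution may work differently.

```python
from typing import List

DEFAULT_CONFIG_URIS = (
    "{{ TAMR_HADOOP_HOME }}/etc/hadoop/core-site.xml;"
    "{{ TAMR_HADOOP_HOME }}/etc/hadoop/yarn-site.xml"
)


def expand_config_uris(value: str, hadoop_home: str) -> List[str]:
    """Substitute the Hadoop home placeholder and split the list on semicolons."""
    expanded = value.replace("{{ TAMR_HADOOP_HOME }}", hadoop_home)
    return [uri.strip() for uri in expanded.split(";") if uri.strip()]


# "/opt/hadoop" is an assumed example location, not a Tamr default.
for uri in expand_config_uris(DEFAULT_CONFIG_URIS, "/opt/hadoop"):
    print(uri)
```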

TAMR_FS_EXTRA_CONFIG

Dictionary of key:value pairs of filesystem configuration.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True

TAMR_FS_EXTRA_URIS

Stores a semicolon-separated list of the URIs for non-XML HDFS configuration files.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_FS_KERBEROS_ENABLED

Indicates whether the HDFS configuration uses Kerberos for authentication.

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_FS_URI

Primary filesystem URI. Set to the root of the filesystem. Some examples: "file:///", "gs://tamr-bucket/", "s3://tamr-bucket/", "hdfs://tamr-nameservice/"

DEFAULT VALUE

file:///

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
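
The example URIs for TAMR_FS_URI differ in scheme and, for remote filesystems, in the bucket or nameservice that follows it. The sketch below uses Python's standard urllib.parse to pull these two parts apart; it only illustrates the URI structure and says nothing about how Tamr itself parses the value.

```python
from typing import Tuple
from urllib.parse import urlparse


def describe_fs_uri(uri: str) -> Tuple[str, str]:
    """Return the (scheme, authority) pair of a filesystem URI."""
    parsed = urlparse(uri)
    return parsed.scheme, parsed.netloc


for uri in ("file:///", "gs://tamr-bucket/", "s3://tamr-bucket/", "hdfs://tamr-nameservice/"):
    scheme, authority = describe_fs_uri(uri)
    print(f"{uri}: scheme={scheme!r}, authority={authority!r}")
```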

TAMR_GOOGLE_APPLICATION_CREDENTIALS

Path to the service account credential key file to use to connect to Google Cloud services. If this variable is set, it takes precedence over other default application credentials set with the environment variable GOOGLE_APPLICATION_CREDENTIALS or from GCE instance credentials.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: True

TAMR_GRAFANA_CONFIG

DEFAULT VALUE

{{ TAMR_GRAFANA_HOME }}/conf/tamr-grafana-config.ini

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_GRAFANA_CONFIG_TEMPLATE

DEFAULT VALUE

{{ TAMR_GRAFANA_HOME }}/conf/tamr-grafana-config.ini.template

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_GRAFANA_DASHBOARDS_DIR

DEFAULT VALUE

{{ TAMR_GRAFANA_HOME }}/dashboards

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_GRAFANA_DATA_DIR

DEFAULT VALUE

{{ TAMR_GRAFANA_HOME }}/data

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_GRAFANA_ENABLED

Set to 'true' to use Grafana for metrics dashboards, and also enable Prometheus, which collects the metrics used by Grafana.

DEFAULT VALUE

False

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

TAMR_GRAFANA_HOME

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/grafana-6.3.4

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_GRAFANA_LOG_DIR

DEFAULT VALUE

{{ TAMR_LOG_DIR }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_GRAFANA_PLUGIN_DIR

DEFAULT VALUE

{{ TAMR_GRAFANA_DATA_DIR }}/plugins

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_GRAFANA_PORT

DEFAULT VALUE

31101

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_GRAFANA_URL_BASE_PATH

Set this if you need Grafana to be served from a path other than "/".

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

TAMR_GRAPHITE_EXPORTER_LOG_DIR

DEFAULT VALUE

{{ TAMR_LOG_DIR }}/graphite_exporter.d

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_GRAPHITE_EXPORTER_PORT

DEFAULT VALUE

31108

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_GRAPHITE_EXPORTER_TARGET

DEFAULT VALUE

{{ TAMR_PROMETHEUS_TARGETS_DIR }}/standard/graphite.yml

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_GRAPHITE_EXPORTER_TARGET_TEMPLATE

DEFAULT VALUE

{{ TAMR_PROMETHEUS_TARGETS_DIR }}/standard/graphite.yml.template

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_GRAPHITE_PORT

DEFAULT VALUE

31109

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_GRAPHITE_HOST

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_GUIDED_WALKTHROUGH_ENABLE

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HADOOP_HOME

The root of the Hadoop directory.

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/hadoop-2.7.3/

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_HARDKILL_TIMEOUT_SECONDS

The time period, in seconds, that Tamr waits after attempting to gracefully shut down the processes of its dependent components, such as HBase, YARN, or ZooKeeper. The default is 10 seconds. You typically do not need to change this parameter. If the processes have not stopped successfully after this time period, Tamr terminates any and all remaining processes cleanly. Do not use any other scripts to shut down Tamr-dependent components. Instead, use stop-dependencies.sh, which relies on this parameter to stop all processes reliably.

DEFAULT VALUE

10

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False

TAMR_HBASE_BATCH_SIZE

During batch writes to HBase, this is the number of rows Tamr buffers before passing them to the HBase client. With HBase (but not Bigtable), the actual buffer is larger if TAMR_HBASE_SORT_BATCHES_BUFFER_FACTOR is set to a value greater than one, but rows are still passed to the client in batches of this size.

DEFAULT VALUE

1000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_HBASE_CHMOD_ENABLED

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HBASE_COMPRESSION

Identifies what algorithms to use in HBase for the compression and decompression of data.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrHbaseCompression
  • secure: False

TAMR_HBASE_CONFIG_URIS

Connection info for the HBase client to directly connect to the HBase instance(s). Supports file://, zk://, and http:// URI schemes.

DEFAULT VALUE

{{ TAMR_HBASE_HOME }}/conf/hbase-site.xml

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_HBASE_DATA_DIR

DEFAULT VALUE

file://{{ TAMR_UNIFY_HOME }}/tamr/hbase-data

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting', 'unify']
  • secure: False
  • description: None

TAMR_HBASE_EXPORTER_ENABLED

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_HBASE_EXTRA_CONFIG

DEFAULT VALUE

{}

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HBASE_EXTRA_URIS

Supporting non-XML files (for example, hbase-env.sh) to be colocated with the files defined by TAMR_HBASE_CONFIG_URIS.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_HBASE_GC_INTERVAL_BETWEEN_GC_IN_SECONDS

DEFAULT VALUE

86400

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HBASE_GC_MAX_CONCURRENT_TABLES

DEFAULT VALUE

3

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HBASE_GC_PERFORM_MAJOR_COMPACTION

DEFAULT VALUE

true

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HBASE_HOME

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/hbase-1.3.1

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_HBASE_HSTORE_BLOCKING_STORE_FILES

DEFAULT VALUE

200

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HBASE_HREGION_MEMSTORE_FLUSH_SIZE

DEFAULT VALUE

536870912

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HBASE_HREGION_MEMSTORE_BLOCK_MULTIPLIER

DEFAULT VALUE

8

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HBASE_KERBEROS_ENABLED

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HBASE_KEY_SPACE

(experimental) The key space controls how Tamr maps records in Tamr datasets to rows in HBase. Record IDs can be "salted," i.e., prepended with a hash of their value, and "prefixed," i.e., prepended with a constant value. The allowed values are:

  • SALT - use a salt only: <record_id>
  • PREFIX - use a prefix only: <record_id>
  • SALT_AND_PREFIX - add a prefix, then salt: <record_id>
  • PREFIX_AND_SALT - salt, then add a prefix: <record_id>

Note that when using TAMR_HBASE_STORAGE_MODE=SHARED, the key space MUST include a prefix (otherwise, Tamr will default to SALT_AND_PREFIX). The default value depends on TAMR_HBASE_STORAGE_MODE as follows:

  TAMR_HBASE_STORAGE_MODE    Default key space
  DEDICATED                  SALT
  SHARED                     SALT_AND_PREFIX

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
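
The rules for the effective key space (the per-mode default, plus the fallback when SHARED mode is combined with a key space that has no prefix) can be written down as a small lookup. The function below is a hypothetical sketch of those documented rules, not Tamr's implementation.

```python
from typing import Optional

# Defaults documented for each TAMR_HBASE_STORAGE_MODE value.
DEFAULT_KEY_SPACE = {"DEDICATED": "SALT", "SHARED": "SALT_AND_PREFIX"}
ALLOWED_KEY_SPACES = {"SALT", "PREFIX", "SALT_AND_PREFIX", "PREFIX_AND_SALT"}
PREFIXED_KEY_SPACES = {"PREFIX", "SALT_AND_PREFIX", "PREFIX_AND_SALT"}


def effective_key_space(storage_mode: str, key_space: Optional[str] = None) -> str:
    """Resolve the key space that applies for a storage mode and optional override."""
    if storage_mode not in DEFAULT_KEY_SPACE:
        raise ValueError(f"unknown storage mode: {storage_mode}")
    if key_space is None:
        return DEFAULT_KEY_SPACE[storage_mode]
    if key_space not in ALLOWED_KEY_SPACES:
        raise ValueError(f"unknown key space: {key_space}")
    # SHARED mode requires a prefix; otherwise fall back to SALT_AND_PREFIX.
    if storage_mode == "SHARED" and key_space not in PREFIXED_KEY_SPACES:
        return "SALT_AND_PREFIX"
    return key_space


print(effective_key_space("DEDICATED"))       # SALT
print(effective_key_space("SHARED"))          # SALT_AND_PREFIX
print(effective_key_space("SHARED", "SALT"))  # SALT_AND_PREFIX
```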

TAMR_HBASE_MASTER_EXPORTER_LOG_DIR

DEFAULT VALUE

{{ TAMR_LOG_DIR }}/hbase_master_exporter.d

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_HBASE_MASTER_EXPORTER_PORT

DEFAULT VALUE

9113

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_HBASE_MASTER_HOSTNAME

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_HBASE_MASTER_JMX_PORT

DEFAULT VALUE

60010

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_HBASE_MASTER_MEM

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrHbaseMasterMem
  • secure: False
  • description: None

TAMR_HBASE_NAMESPACE

DEFAULT VALUE

tamr

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HBASE_TABLE_CREATION_THREADS

Number of threads to use when creating multiple HBase tables.

DEFAULT VALUE

1

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_HBASE_NUMBER_OF_REGIONS

DEFAULT VALUE

1

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HBASE_NUMBER_OF_SALT_VALUES

The number of distinct salt values to be used for prefixing row keys in HBase tables.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrHBaseNumberOfSaltValues
  • secure: False

TAMR_HBASE_OFFPEAK_END_HOUR

When to stop using off-peak compaction settings in server local time (usually GMT), expressed as an integer between 0 and 23.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

TAMR_HBASE_OFFPEAK_START_HOUR

When to begin using off-peak compaction settings in server local time (usually GMT), expressed as an integer between 0 and 23.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
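
The start and end hours together define a daily off-peak window that may wrap past midnight (for example, 22 to 6). The sketch below shows one way to test whether a given hour falls inside such a window. Treating the window as including the start hour and excluding the end hour is an assumption made for this example, as is treating unset (null) values as "no off-peak window".

```python
from typing import Optional


def is_off_peak(hour: int, start_hour: Optional[int], end_hour: Optional[int]) -> bool:
    """Return True if `hour` (0-23, server local time) is inside the off-peak window."""
    if start_hour is None or end_hour is None:
        return False  # off-peak hours are not configured
    for value in (hour, start_hour, end_hour):
        if not 0 <= value <= 23:
            raise ValueError("hours must be integers between 0 and 23")
    if start_hour == end_hour:
        return False  # empty window
    if start_hour < end_hour:
        return start_hour <= hour < end_hour
    # The window wraps past midnight, e.g. 22 -> 6.
    return hour >= start_hour or hour < end_hour


print(is_off_peak(23, start_hour=22, end_hour=6))  # True
print(is_off_peak(12, start_hour=22, end_hour=6))  # False
```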

TAMR_HBASE_PRODUCER_THREADS

HBase scans are parallelized with this many threads.

DEFAULT VALUE

10

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

TAMR_HBASE_REGIONSERVER_EXPORTER_LOG_DIR

DEFAULT VALUE

{{ TAMR_LOG_DIR }}/hbase_regionserver_exporter.d

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_HBASE_REGIONSERVER_EXPORTER_PORT

DEFAULT VALUE

9114

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_HBASE_REGIONSERVER_JMX_PORT

DEFAULT VALUE

60030

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_HBASE_REGION_SERVER_MEM

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrHbaseRegionServerMem
  • secure: False
  • description: None

TAMR_HBASE_REMOTE_DOWNLOAD_ENABLED

Whether to download HBase configuration files from a remote filesystem.

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_HBASE_REPLICATION_FACTOR

DEFAULT VALUE

1

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HBASE_ROLLBACK_MAX_CONCURRENCY

DEFAULT VALUE

3

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HBASE_ROLLBACK_POLLER_DELAY_BETWEEN_RUNS_SECONDS

DEFAULT VALUE

60

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HBASE_SORT_BATCHES_ENABLED

HBase only; with Bigtable this is always true. If set to true, batch writes to HBase are sorted before being written, unless TAMR_HBASE_SORT_BATCHES_BUFFER_FACTOR is set to a value less than one. Default is false.

DEFAULT VALUE

{{ TAMR_REMOTE_HBASE_ENABLED }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_HBASE_SORT_BATCHES_BUFFER_FACTOR

HBase only; with Bigtable this is always 1. When sorting is enabled, this defines the number of batches that are buffered and sorted at a time. For example, if TAMR_HBASE_BATCH_SIZE is 1000 and TAMR_HBASE_SORT_BATCHES_BUFFER_FACTOR is 10, then every 10,000 rows are buffered and sorted together before writing. If this value is zero or less, sorting is disabled. The default is 100. Note that this value is ignored unless TAMR_HBASE_SORT_BATCHES_ENABLED is true.

DEFAULT VALUE

100

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
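
The arithmetic in the description above can be sketched as a small helper. The function name rows_per_sort and its parameters are hypothetical and only mirror the variables named in the description; this is not Tamr's implementation.

```python
def rows_per_sort(batch_size: int, buffer_factor: int, sort_enabled: bool = True) -> int:
    """Return how many rows are buffered and sorted together.

    Returns 0 when sorting is effectively disabled, that is, when the
    enabled flag is false or the buffer factor is zero or less.
    """
    if not sort_enabled or buffer_factor <= 0:
        return 0
    return batch_size * buffer_factor


if __name__ == "__main__":
    # Values from the description: batch size 1000, buffer factor 10.
    print(rows_per_sort(1000, 10))  # prints 10000
```

A larger buffer factor sorts more rows at once, at the cost of holding more rows in memory before each write.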

TAMR_HBASE_STORAGE_MODE

(experimental) The storage mode controls how the system will store and access datasets in HBase. The supported modes are DEDICATED (default) and SHARED. In DEDICATED mode, each Tamr dataset is stored in its own dedicated HBase table. In SHARED mode, all Tamr datasets are stored in a single shared HBase table. This configuration variable is only effective when new datasets are created; changing it will not affect how pre-existing datasets are stored/accessed.

DEFAULT VALUE

DEDICATED

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_HBASE_ZK_CLIENT_PORT

DEFAULT VALUE

2181

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HBASE_ZK_DIR

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/tamr/hbase-data

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_HBASE_ZK_QUORUM

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify', 'supporting']
  • secure: False
  • description: None

TAMR_HBASE_ZK_SESSION_TIMEOUT

DEFAULT VALUE

240000

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_HBASE_ZK_TICKTIME

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrZkTickTime
  • secure: False
  • description: None

TAMR_HBASE_ZK_ROOT_NODE

DEFAULT VALUE

/hbase

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_HTTP_CLIENT_CONNECTION_TIMEOUT

The maximum time to wait for an HTTP client connection to open. Be careful setting this value as it will affect ALL services.

DEFAULT VALUE

120s

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_HTTP_CLIENT_TIMEOUT

The maximum idle time for a connection, once established. Be careful setting this value as it will affect ALL services.

DEFAULT VALUE

450s

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_HTTP_MAX_HEADER_SIZE

The maximum size of a request header. Larger headers will allow for more and/or larger cookies plus larger form content encoded in a URL. However, larger headers consume more memory and can make a server more vulnerable to denial of service attacks.

DEFAULT VALUE

100Kib

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_HTTP_MAX_IDLE_TIMEOUT

The maximum time to wait for a new message to be received or sent on an HTTP connection. This value is interpreted as the maximum time allowed between any progress on the connection: if a single byte is read or written, the timeout is reset. Be careful setting this value as it will affect ALL services.

DEFAULT VALUE

30s

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JAVA_EXECUTABLE

DEFAULT VALUE

{{ TAMR_JAVA_HOME }}/bin/java

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting', 'unify']
  • secure: False
  • description: None

TAMR_JAVA_HOME

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/openjdk-8u222/

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_JOB_DATAPROC_CLUSTER_CONFIG

YAML-valued cluster configuration to pass to Dataproc when creating ephemeral managed clusters to run Spark jobs. Only used when TAMR_JOB_SPARK_CLUSTER is set to "dataproc-ephemeral". For details, see the Google Cloud docs: https://cloud.google.com/dataproc/docs/reference/rest/v1/ClusterConfig

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True

TAMR_JOB_DATAPROC_CLUSTER_NAME

When running jobs on a provisioned Dataproc cluster (i.e., TAMR_JOB_SPARK_CLUSTER = "dataproc"), this is the name of the cluster to use. When running with ephemeral managed Dataproc clusters (TAMR_JOB_SPARK_CLUSTER = "dataproc-ephemeral"), this is the prefix of the (randomized) names given to the ephemeral clusters when they are created.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_DATAPROC_PROJECT_ID

The GCP project ID in which to use Dataproc.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_DATAPROC_REGION

The GCP region in which to use Dataproc.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_EMR_CLUSTER_ID

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_JOB_DATABRICKS_HOST

The Databricks host URL.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_DATABRICKS_TOKEN

The Databricks access token for the host.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True

TAMR_JOB_DATABRICKS_WORKINGSPACE

DBFS directory in which data will be written for Databricks jobs.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_DATABRICKS_MIN_WORKERS

Minimum number of workers for a Databricks cluster.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_DATABRICKS_MAX_WORKERS

Maximum number of workers for a Databricks cluster.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_DATABRICKS_SPARK_VERSION

Databricks runtime version.

DEFAULT VALUE

6.4.x-scala2.11

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_DATABRICKS_NODE_TYPE

Worker type for use in Databricks cluster (e.g. Standard_DS12_v2).

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_DATABRICKS_ENABLE_DBFS_FILESIZE_CHECK

Whether to verify that the size of a file uploaded to DBFS matches the size of the original file.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_DATABRICKS_USER_AGENT

User agent to set in the Databricks request header.

DEFAULT VALUE

tamr/cloud-native

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_DATABRICKS_SOCKET_TIMEOUT

HTTP client socket timeout, in milliseconds.

DEFAULT VALUE

10000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_DATABRICKS_CONNECTION_TIMEOUT

HTTP client connection timeout, in milliseconds.

DEFAULT VALUE

10000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_DATABRICKS_CONNECTION_REQUEST_TIMEOUT

HTTP client connection request timeout, in milliseconds.

DEFAULT VALUE

10000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_DATABRICKS_MAX_RETRY

Maximum number of HTTP client retries.

DEFAULT VALUE

3

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_DATABRICKS_REQUEST_SENT_RETRY_ENABLED

Whether HTTP client retry is enabled.

DEFAULT VALUE

False

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_DATABRICKS_RETRY_INTERVAL

HTTP client retry interval, in milliseconds.

DEFAULT VALUE

10000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
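
As a rough illustration of how a maximum retry count and a retry interval typically interact in an HTTP client, the sketch below retries a failing call a bounded number of times and waits a fixed interval between attempts. The function call_with_retry, its parameters, and the choice to retry on OSError are assumptions for illustration; they do not describe how Tamr or the Databricks client actually applies these settings.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    request: Callable[[], T],
    max_retry: int = 3,              # mirrors the documented default of 3
    retry_interval_ms: int = 10000,  # mirrors the documented default of 10000
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying up to `max_retry` times on OSError."""
    attempt = 0
    while True:
        try:
            return request()
        except OSError:
            if attempt >= max_retry:
                # Retries exhausted: surface the last error to the caller.
                raise
            attempt += 1
            # Wait a fixed interval before the next attempt.
            sleep(retry_interval_ms / 1000.0)
```

With the defaults shown, a request that keeps failing is attempted four times in total (one initial attempt plus three retries), with 10 seconds between attempts.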

TAMR_JOB_SPALK_CLUSTER_NAME

Name of the Spark cluster resource to target when making calls to Spalk. Required when using Spalk for job submission.

DEFAULT VALUE

null

PROPERTIES

  • dependencies: ['unify']
  • secure: False

TAMR_JOB_SPALK_TLS_ENABLE

Whether to use transport layer security when making RPC calls to Spalk. Default is false, i.e., requests are sent in plain text.

DEFAULT VALUE

False

PROPERTIES

  • dependencies: ['unify']
  • secure: False

TAMR_JOB_SPARK_AUX_JAR

DEFAULT VALUE

[{{ TAMR_JOB_SPARK_POSTGRES_JAR }}]

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_JOB_SPARK_CLUSTER

The Spark master URL of the Spark cluster to use to execute jobs.

DEFAULT VALUE

yarn

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_SPARK_CONFIG_OVERRIDES

A list of named sets of Spark config overrides.

DEFAULT VALUE

[]

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_SPARK_DEFAULT_CONFIG_NAME

The name of the default set of Spark config.

DEFAULT VALUE

default

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_SPARK_DRIVER_MEM

The amount of memory allocated to the Spark driver. As a rule, TAMR_SPARK_MEMORY should be set to approximately 10% larger than TAMR_SPARK_DRIVER_MEM + (TAMR_JOB_SPARK_EXECUTOR_MEM * TAMR_JOB_SPARK_EXECUTOR_INSTANCES).

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrJobSparkDriverMem
  • secure: False

TAMR_JOB_SPARK_ENV

A map of extra environment variables to set on Spark processes. Note that depending on the target Spark service, some environment variables are set by the system by default. Values added here are merged with those defaults and will take precedence in case of key conflict. Default is empty.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
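
The merge behavior described above (extra values are combined with system defaults and win on key conflict) can be sketched as follows. The function name merge_spark_env and the sample variable names are hypothetical; the actual defaults set by the system depend on the target Spark service.

```python
from typing import Dict, Optional


def merge_spark_env(
    system_defaults: Dict[str, str],
    extra_env: Optional[Dict[str, str]],
) -> Dict[str, str]:
    """Merge extra environment variables over the system defaults.

    Keys in `extra_env` take precedence when both maps define the same key.
    A missing (None) `extra_env` is treated as empty.
    """
    merged = dict(system_defaults)
    merged.update(extra_env or {})
    return merged


if __name__ == "__main__":
    defaults = {"EXAMPLE_DEFAULT": "system", "SHARED_KEY": "system"}  # hypothetical
    extras = {"SHARED_KEY": "user", "EXAMPLE_EXTRA": "user"}          # hypothetical
    print(merge_spark_env(defaults, extras))
```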

TAMR_JOB_SPARK_EVENT_LOGS_DIR

The location to store the Spark event logs.

DEFAULT VALUE

{{ TAMR_UNIFY_DATA_DIR }}/job/sparkEventLogs

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_SPARK_EXECUTOR_CORES

The number of cores assigned to each Spark executor process.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrJobSparkExecutorCores
  • secure: False

TAMR_JOB_SPARK_EXECUTOR_DOCKER_IMAGE

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_JOB_SPARK_EXECUTOR_INSTANCES

The number of Spark executors to run.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrJobSparkExecutorInstances
  • secure: False

TAMR_JOB_SPARK_EXECUTOR_MEM

The amount of memory allocated to each Spark executor. As a rule, TAMR_SPARK_MEMORY should be set to approximately 10% larger than TAMR_SPARK_DRIVER_MEM + (TAMR_JOB_SPARK_EXECUTOR_MEM * TAMR_JOB_SPARK_EXECUTOR_INSTANCES). If not set, Tamr calculates a default from TAMR_SPARK_MEMORY, TAMR_JOB_SPARK_DRIVER_MEM, and TAMR_SPARK_CORES (which is calculated from TAMR_TOTAL_CORES and TAMR_TOTAL_MEMORY).

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrJobSparkExecutorMem
  • secure: False
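
The sizing rule of thumb above can be worked through with a short calculation. The function name and the use of gigabytes as the unit are assumptions for illustration; the 10% margin comes from the rule stated in the description.

```python
def recommended_spark_memory(
    driver_mem_gb: float,
    executor_mem_gb: float,
    executor_instances: int,
    margin: float = 0.10,
) -> float:
    """Return driver + (executor * instances), enlarged by `margin`."""
    base = driver_mem_gb + executor_mem_gb * executor_instances
    return base * (1 + margin)


if __name__ == "__main__":
    # Hypothetical sizing: 5 GB driver, 4 executors of 10 GB each.
    # Base requirement is 5 + 10 * 4 = 45 GB; with a 10% margin, about 49.5 GB.
    print(round(recommended_spark_memory(5, 10, 4), 1))
```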

TAMR_JOB_SPARK_JAR

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/tamr/libs/unifySpark.jar

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_JOB_SPARK_LOCAL_YARN_JARS

A list, separated by semicolons (;), of glob-supported paths to jars that will be used by the Yarn cluster with a local filesystem.

DEFAULT VALUE

{{ TAMR_SPARK_HOME }}/jars/*

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
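
To show what a semicolon-separated list of glob-supported paths expands to, the sketch below splits a value on semicolons and resolves each pattern with the standard library. The function name expand_jar_paths and the sample paths are hypothetical, and this is not necessarily how Tamr resolves the value internally.

```python
import glob
from typing import List


def expand_jar_paths(value: str) -> List[str]:
    """Split a semicolon-separated list of glob patterns and expand each one."""
    jars: List[str] = []
    for pattern in value.split(";"):
        pattern = pattern.strip()
        if not pattern:
            # Ignore empty entries such as a trailing semicolon.
            continue
        jars.extend(sorted(glob.glob(pattern)))
    return jars


if __name__ == "__main__":
    # Hypothetical value with two patterns.
    for jar in expand_jar_paths("/opt/spark/jars/*;/opt/extra/libs/*.jar"):
        print(jar)
```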

TAMR_JOB_SPARK_LOG4J_PROPS

Path to the jinja2 template file for the log4j.properties file to render and send to Spark to configure logging from Spark drivers and executors. If not specified or invalid, the system will use a default template from a built-in resource file.

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/tamr/conf/log4j.properties.j2

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_SPARK_POSTGRES_JAR

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/tamr/libs/postgresql-42.1.4.jar

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_BIGQUERY_ENABLED

Whether to enable support for datasets backed by Google BigQuery. Note that support for BigQuery is conditional on having Google Cloud default application credentials present in the local host environment. If you are running Tamr on a GCP host, but are using a remote Spark cluster that is not in GCP, you may need to disable this setting to avoid authentication errors in Spark jobs.

DEFAULT VALUE

True

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_BIGQUERY_TEMP_GCS_BUCKET

For use with BigQuery datasets. Optional. If set, then BigQuery operations that use temporary storage in Google Cloud Storage (GCS) will use the specified bucket. If unset, the system will attempt to use the staging bucket associated with Spark jobs. If no such bucket is available, BigQuery operations may fail.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_SPARK_BIGQUERY_JAR

Location of the BigQuery Spark connector jar.

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/tamr/libs/spark-bigquery.jar

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

TAMR_JOB_SPARK_PROPS

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_BIGQUERY_PROJECTS_INCLUDE

A list of projects for the BigQuery driver to include.

DEFAULT VALUE

[]

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_SPARK_SUBMIT_TIMEOUT_SECONDS

The timeout period in seconds for Spark submitters.

DEFAULT VALUE

900

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_JOB_SPARK_YARN_QUEUE

Name of the Yarn queue that Spark jobs will be submitted to.

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_KERBEROS_KEYTAB

When TAMR_FS_KERBEROS_ENABLED is set to true, identifies the path to the Kerberos keytab file.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_KERBEROS_KRB5

When TAMR_FS_KERBEROS_ENABLED is set to true, identifies the path to the Kerberos krb5.conf file.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_KERBEROS_PRINCIPAL

When TAMR_FS_KERBEROS_ENABLED is set to true, identifies the principal to use in the keytab file defined by TAMR_KERBEROS_KEYTAB.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_KIBANA_HOME

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/kibana-{{ TAMR_KIBANA_VERSION }}-linux-x86_64

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_KIBANA_LOG_DIR

DEFAULT VALUE

{{ TAMR_LOG_DIR }}/kibana.d

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_KIBANA_MEMORY

DEFAULT VALUE

1024

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_KIBANA_PORT

DEFAULT VALUE

5601

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_KIBANA_VERSION

DEFAULT VALUE

5.6.16

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_LAUNCHER_INTERVAL

DEFAULT VALUE

10

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_LAUNCHER_JITTER

DEFAULT VALUE

20

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_LAUNCHER_LOG_DIR

DEFAULT VALUE

{{ TAMR_LOG_DIR }}/launcher.d

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_LAUNCHER_SCRIPT_DIR

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/launcher.d

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_LD_LIBRARY_PATH

A list of directories, separated by colons (:), in which the dynamic linker must search for the execution libraries before starting up Tamr processes. Listed directories are appended to the LD_LIBRARY_PATH environment variable and Tamr uses them when starting up. For more information, see the man page for ld.so linker in Linux.

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting', 'unify']
  • secure: False
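
The effect of appending listed directories to LD_LIBRARY_PATH can be sketched as a simple string operation. The function name append_library_path and the sample directories are hypothetical; the sketch only illustrates the colon-separated format described above.

```python
def append_library_path(existing: str, extra: str) -> str:
    """Append the colon-separated directories in `extra` to `existing`."""
    parts = [p for p in existing.split(":") if p]
    parts += [p for p in extra.split(":") if p]
    return ":".join(parts)


if __name__ == "__main__":
    # Hypothetical current LD_LIBRARY_PATH and configured value.
    print(append_library_path("/usr/lib", "/opt/a/lib:/opt/b/lib"))
    # prints /usr/lib:/opt/a/lib:/opt/b/lib
```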

TAMR_LD_PRELOAD

A list of paths to shared libraries, separated by colons (:), that the dynamic linker must load before any other shared libraries. Listed libraries are appended to the LD_PRELOAD environment variable and Tamr uses them when starting up. For more information, see the man page for the ld.so linker in Linux. Note: Unlike LD_LIBRARY_PATH, LD_PRELOAD takes a path to the library/object, not the containing directory.

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting', 'unify']
  • secure: False

TAMR_LICENSE_KEY

Key provided by Tamr to allow you to log into the system. Please contact your Tamr representative if you require an updated key.

DEFAULT VALUE

sib1Q+ZrXnDj8Trc0FLVqi2qjvQIPuls0QgrofF6waXzL8LqUOSVB/5bgMaIw/saqOOSZBjuP3gMOqh+y4TDil7koE9jCnPgVVlWSVNVupwVQ6Uf+XU3Wra2Ji9ffEZwtFgtEe1cxeEUnkCqYc6ov0PTerTOkUc7ItAclKryJyc3W1GPtiYo2IqYnx3Z0LJa0AE8NvIbKYRin+XqaHaGzYpU2I+eNdXnWIgW1cB4GN57xyt8zYmMU65OwF2mx+SNDF31JRw1uh0007hnOrKaMyyhMyLPNb8MO0C0K5DP2zW/iyCgxAhCcxuAIIJ1LYaOHtf5VLkwcAeCdAdxU4BuxZIsAvdoecplARN+VVUnsJVYvtK5hSH3yJiNxG9QhtLKnUUo8yYw8y7IabvcbkGhEUU7xsxJ1t+TVkjKtLWGADzNrd3r0tQMvTdATvjevNoyiUohaDlIQWH0Tu07yWnxvuXXS6KoemuwqoatWEL4S78D3xdMMl9yhYLTfn//kVQRuZ1hByYCCQjdp7i3YeCXGm94QGIqd0kJcgkGqoSDRjIEjMkzOVqFmOz/+ISg4SS9zYRzP1tX3ilHccaz5Wgxk8QxJ0uPDJEso8Z8xUJXuXLTpBdNISLyPAN4Ipp4OWwKdpVySJFUXWR2HRLSQKOlx5TUeA6cHwz60Fx9TfrF4iw=

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_LIVE_HELP_TRACKING_ENABLE

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_LLM_BATCH_SIZE

Defines the expected largest batch size for records to match in a mastering project. Adjusting this value affects both latency and throughput.

DEFAULT VALUE

1000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_LLM_REFRESH_INTERVAL_IN_MILLISECONDS

DEFAULT VALUE

2000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_LLM_TOPK

Defines the top K number of similar clusters and records for the low-latency match (LLM) service to return.

DEFAULT VALUE

10

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_LOCAL_IDENT_STRING

The identity string used to name local instances of Yarn/HBase. Yarn/HBase will use this string to name their PID files in order to avoid filename collisions.

DEFAULT VALUE

tamr-local

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False

TAMR_LOG_DIR

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/tamr/logs

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_LOG_JSON_ENABLED

Whether to format logs as JSON objects (true) or text (false). Default is text (false).

DEFAULT VALUE

False

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_LOG_LAYOUT_ACCESS

A yaml dictionary defining the layout of JSON formatted access logs. The allowed configuration is documented in io.dropwizard.logging.json.AccessJsonLayoutBaseFactory. Ordinarily, this should not be set directly. Instead, the value is computed based on other config variables.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrLogLayout$Access
  • secure: False

TAMR_LOG_LAYOUT_EVENT

A yaml dictionary defining the layout of JSON formatted event logs. The allowed configuration is documented in io.dropwizard.logging.json.EventJsonLayoutBaseFactory. Ordinarily, this should not be set directly. Instead, the value is computed based on other config variables.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrLogLayout$Event
  • secure: False

TAMR_LOG_RETENTION_DAYS

DEFAULT VALUE

30

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_MATCH_ADMIN_BIND_PORT

DEFAULT VALUE

9171

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_MATCH_ADMIN_PORT

DEFAULT VALUE

9171

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_MATCH_BIND_PORT

DEFAULT VALUE

9170

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_MATCH_HOSTNAME

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_MATCH_LOG_LEVEL

DEFAULT VALUE

INFO

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_MATCH_PORT

DEFAULT VALUE

9170

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_MICROMETER_CONFIG

Configuration, in yaml format, for Tamr's metric reporting through Micrometer.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: True

TAMR_MICROMETER_SPARK_CONFIG

Configuration, in yaml format, for Tamr's Spark metric reporting through Micrometer. Note that only whitelist filters are allowed here (no blacklists). Prometheus is also not supported here.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: True

TAMR_MULTILOG_HOME

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/daemontools-encore-multilog-1.10

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_NODE_EXPORTER_LOG_DIR

DEFAULT VALUE

{{ TAMR_LOG_DIR }}/node_exporter.d

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_NODE_EXPORTER_PORT

DEFAULT VALUE

9011

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_OPEN_BROWSER

DEFAULT VALUE

true

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_OS_HEADROOM

The amount of memory reserved as headroom for the operating system. It should be greater than 15% of allocatable system memory; by default, it is set to 16% of total memory. If set too low, the OS works aggressively to free memory from your processes to maintain its pools, which can result in slower performance. The pool of memory Tamr allocates is TAMR_TOTAL_MEMORY minus TAMR_OS_HEADROOM. As a guideline, the upgrade script health check produces a warning if headroom is less than 15%.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrOsHeadroom
  • secure: False
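
The relationship between total memory, headroom, and the pool available to Tamr can be worked through as below. The function name memory_pool and the use of gigabytes are assumptions for illustration; the 16% default and the 15% warning threshold come from the description above, and the real default is computed by the formula class listed in the properties.

```python
from typing import Optional, Tuple


def memory_pool(total_gb: float, headroom_gb: Optional[float] = None) -> Tuple[float, bool]:
    """Return (pool available to Tamr, whether headroom is below 15%).

    When no headroom is given, 16% of total memory is used, mirroring the
    default described for this variable.
    """
    if headroom_gb is None:
        headroom_gb = 0.16 * total_gb
    pool_gb = total_gb - headroom_gb
    headroom_too_low = headroom_gb < 0.15 * total_gb
    return pool_gb, headroom_too_low


if __name__ == "__main__":
    # Hypothetical 100 GB host with the default headroom, then with 10 GB.
    print(memory_pool(100))
    print(memory_pool(100, headroom_gb=10))
```

On the hypothetical 100 GB host, the default leaves a pool of about 84 GB, while a 10 GB headroom yields a 90 GB pool but falls below the 15% guideline.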

TAMR_PAIR_CONFIDENCE_THRESHOLD_HIGH

DEFAULT VALUE

0.7

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PAIR_CONFIDENCE_THRESHOLD_MEDIUM

DEFAULT VALUE

0.25

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PERSISTENCE_ADMIN_BIND_PORT

DEFAULT VALUE

9081

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PERSISTENCE_ADMIN_PORT

DEFAULT VALUE

9081

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_PERSISTENCE_BIND_PORT

DEFAULT VALUE

9080

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PERSISTENCE_DB_CONNECTION_POOL_MAX

Maximum size of the connection pool.

DEFAULT VALUE

32

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_PERSISTENCE_DB_CONNECTION_POOL_MIN

Minimum (and initial) size of the connection pool.

DEFAULT VALUE

8

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_PERSISTENCE_DB_DRIVER

DEFAULT VALUE

org.postgresql.Driver

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PERSISTENCE_DB_NAME

DEFAULT VALUE

doit

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_PERSISTENCE_DB_PASS

Stores the encrypted password for the PostgreSQL service.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: True

TAMR_PERSISTENCE_DB_PORT

DEFAULT VALUE

5432

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_PERSISTENCE_DB_URL

For the PostgreSQL service, identifies the URL of the database to use with the JDBC driver.

DEFAULT VALUE

jdbc:postgresql://{{ TAMR_POSTGRES_HOSTNAME }}:{{ TAMR_PERSISTENCE_DB_PORT }}/{{ TAMR_PERSISTENCE_DB_NAME }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
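
To show how the templated default resolves, the sketch below substitutes the documented default values of the three referenced variables into the URL template. The render helper is a hypothetical stand-in written with a regular expression; Tamr's own template rendering may differ.

```python
import re
from typing import Dict


def render(template: str, values: Dict[str, object]) -> str:
    """Replace each {{ NAME }} placeholder with its value from `values`."""
    def lookup(match: "re.Match[str]") -> str:
        return str(values[match.group(1)])

    # Whitespace inside the braces is optional.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lookup, template)


if __name__ == "__main__":
    template = (
        "jdbc:postgresql://{{ TAMR_POSTGRES_HOSTNAME }}:"
        "{{ TAMR_PERSISTENCE_DB_PORT }}/{{ TAMR_PERSISTENCE_DB_NAME }}"
    )
    defaults = {
        "TAMR_POSTGRES_HOSTNAME": "localhost",
        "TAMR_PERSISTENCE_DB_PORT": 5432,
        "TAMR_PERSISTENCE_DB_NAME": "doit",
    }
    print(render(template, defaults))  # prints jdbc:postgresql://localhost:5432/doit
```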

TAMR_PERSISTENCE_DB_USER

Identifies the username for the PostgreSQL service.

DEFAULT VALUE

tamr

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_PERSISTENCE_EXPORTER_PASS

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: True
  • description: None

TAMR_PERSISTENCE_EXPORTER_USER

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: True
  • description: None

TAMR_PERSISTENCE_HOSTNAME

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PERSISTENCE_IN_FILTER_MAX_ELEMENTS

The maximum number of elements to include in an 'IN' filter before breaking up the query.

DEFAULT VALUE

32000

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
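
Breaking up a query whose 'IN' filter exceeds the maximum can be pictured as splitting the list of values into chunks, each small enough for one query. The function name chunk_in_filter is hypothetical, and the sketch shows only the splitting step, not how Tamr builds or executes the resulting queries.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def chunk_in_filter(values: Sequence[T], max_elements: int = 32000) -> List[Sequence[T]]:
    """Split `values` into consecutive chunks of at most `max_elements` items."""
    if max_elements < 1:
        raise ValueError("max_elements must be at least 1")
    return [values[i:i + max_elements] for i in range(0, len(values), max_elements)]


if __name__ == "__main__":
    ids = list(range(70000))  # hypothetical list of 70,000 filter values
    print([len(chunk) for chunk in chunk_in_filter(ids)])  # prints [32000, 32000, 6000]
```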

TAMR_PERSISTENCE_LOG_LEVEL

DEFAULT VALUE

INFO

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PERSISTENCE_PORT

DEFAULT VALUE

9080

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PG_DUMP_BINARY

Identifies the Postgres backup binary.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.PGDumpBinary
  • secure: False

TAMR_PG_RESTORE_BINARY

Identifies the Postgres restore binary.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.PGRestoreBinary
  • secure: False

TAMR_PID_DEPS_DIR

When stopping dependencies, Tamr stops each process ID (PID) found in this directory, and then waits for TAMR_HARDKILL_TIMEOUT_SECONDS before doing a force stop of any other processes.

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/tamr/pids-deps

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

TAMR_PID_DIR

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/tamr/pids

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_POSTGRES_EXPORTER_LOG_DIR

DEFAULT VALUE

{{ TAMR_LOG_DIR }}/postgres_exporter.d

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_POSTGRES_EXPORTER_PORT

DEFAULT VALUE

31187

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_POSTGRES_EXPORTER_TARGET

DEFAULT VALUE

{{ TAMR_PROMETHEUS_TARGETS_DIR }}/standard/postgres.yml

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_POSTGRES_EXPORTER_TARGET_TEMPLATE

DEFAULT VALUE

{{ TAMR_PROMETHEUS_TARGETS_DIR }}/standard/postgres.yml.template

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_POSTGRES_HOSTNAME

DEFAULT VALUE

localhost

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_PREVIEW_ADMIN_BIND_PORT

DEFAULT VALUE

9041

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PREVIEW_ADMIN_PORT

DEFAULT VALUE

9041

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_PREVIEW_BIND_PORT

DEFAULT VALUE

9040

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PREVIEW_CACHE_MAX_SIZE

DEFAULT VALUE

10

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PREVIEW_CACHE_TIMEOUT

DEFAULT VALUE

1800

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PREVIEW_HOSTNAME

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PREVIEW_LOG_LEVEL

DEFAULT VALUE

INFO

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PREVIEW_NUM_RUNNERS

DEFAULT VALUE

1

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PREVIEW_PORT

DEFAULT VALUE

9040

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PROCUREMENT_HOME

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PROMETHEUS_CONFIG

DEFAULT VALUE

{{ TAMR_PROMETHEUS_HOME }}/prometheus.yml

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_PROMETHEUS_CONFIG_TEMPLATE

DEFAULT VALUE

{{ TAMR_PROMETHEUS_HOME }}/prometheus.yml.template

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_PROMETHEUS_DATADIR

DEFAULT VALUE

{{ TAMR_PROMETHEUS_HOME }}/data

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_PROMETHEUS_HOME

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/prometheus-1.5.2.linux-amd64

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_PROMETHEUS_LOG_DIR

DEFAULT VALUE

{{ TAMR_LOG_DIR }}/prometheus.d

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_PROMETHEUS_PORT

DEFAULT VALUE

31390

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_PROMETHEUS_RULES_DIR

DEFAULT VALUE

{{ TAMR_PROMETHEUS_HOME }}/rules

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_PROMETHEUS_SCRAPE_INTERVAL

DEFAULT VALUE

60s

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_PROMETHEUS_STORAGE_LOCAL_RETENTION

DEFAULT VALUE

72h0m0s

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_PROMETHEUS_TARGETS_DIR

DEFAULT VALUE

{{ TAMR_PROMETHEUS_HOME }}/targets

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None
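
Several defaults on this page are written as templates that reference other variables, for example `{{ TAMR_PROMETHEUS_HOME }}/targets`. The sketch below shows one way such nested references could be expanded. It is an illustration only: the `resolve` helper and the `/data/tamr` install path are hypothetical, and the source does not describe how Tamr itself performs this substitution.

```python
import re

# Matches placeholders such as "{{ TAMR_UNIFY_HOME }}".
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def resolve(name, values, _seen=()):
    """Return the value of `name` with every {{ VAR }} reference expanded."""
    if name in _seen:
        raise ValueError(f"circular reference involving {name}")
    raw = values[name]
    # Expand each placeholder by resolving the variable it names.
    return PLACEHOLDER.sub(
        lambda match: resolve(match.group(1), values, _seen + (name,)), raw
    )


values = {
    "TAMR_UNIFY_HOME": "/data/tamr",  # hypothetical install path
    "TAMR_PROMETHEUS_HOME": "{{ TAMR_UNIFY_HOME }}/prometheus-1.5.2.linux-amd64",
    "TAMR_PROMETHEUS_TARGETS_DIR": "{{ TAMR_PROMETHEUS_HOME }}/targets",
    "TAMR_POSTGRES_EXPORTER_TARGET": "{{ TAMR_PROMETHEUS_TARGETS_DIR }}/standard/postgres.yml",
}

print(resolve("TAMR_POSTGRES_EXPORTER_TARGET", values))
```

Because the defaults chain through TAMR_UNIFY_HOME, changing that one value moves every path derived from it.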

TAMR_PUBAPI_ADMIN_BIND_PORT

DEFAULT VALUE

9181

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PUBAPI_BIND_PORT

DEFAULT VALUE

9180

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PUBAPI_HOSTNAME

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PUBAPI_LOG_LEVEL

DEFAULT VALUE

INFO

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PUBAPI_NAME

DEFAULT VALUE

unified-data

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_PUBAPI_PORT

DEFAULT VALUE

9180

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_RECIPE_ADMIN_BIND_PORT

DEFAULT VALUE

9191

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_RECIPE_ADMIN_PORT

DEFAULT VALUE

9191

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_RECIPE_BIND_PORT

DEFAULT VALUE

9190

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_RECIPE_HOSTNAME

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_RECIPE_INIT_RETRY_INTERVAL_MS

Retry interval for recipe service initialization (in milliseconds). Defaults to 5000 (5 seconds).

DEFAULT VALUE

5000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_RECIPE_INIT_TIMEOUT_MS

Timeout for recipe service initialization (in milliseconds). Defaults to 60000 (1 minute).

DEFAULT VALUE

60000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
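
Read together, TAMR_RECIPE_INIT_RETRY_INTERVAL_MS and TAMR_RECIPE_INIT_TIMEOUT_MS suggest a poll-until-deadline loop: with the defaults, roughly 60000 / 5000 = 12 retries fit inside the timeout. The following is a minimal sketch of that pattern, not Tamr's implementation. The `wait_until_ready` function and its `check` callable are hypothetical names introduced for illustration.

```python
import time


def wait_until_ready(check, timeout_ms=60000, retry_interval_ms=5000,
                     sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or timeout_ms has elapsed.

    Returns the number of attempts made; raises TimeoutError on failure.
    """
    interval = retry_interval_ms / 1000.0
    deadline = clock() + timeout_ms / 1000.0
    attempts = 0
    while True:
        attempts += 1
        if check():
            return attempts
        # Stop if the next retry would start after the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"not ready after {attempts} attempts")
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.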

TAMR_RECIPE_LOG_LEVEL

DEFAULT VALUE

INFO

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_RECIPE_PORT

DEFAULT VALUE

9190

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_RECIPE_GR_PREVIEW_MAX_RECORDS

The maximum number of records that can be used as input to a golden records preview request. This is the sum of the sizes of all clusters that are to be previewed. If this limit is increased, the microservice heap size (TAMR_CONGLOMERATE_MEMORY) may also need to be increased.

DEFAULT VALUE

5000

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
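
Because the limit applies to the sum of cluster sizes rather than to the number of clusters, a request with a few large clusters can exceed it. A small sketch of that check, using a hypothetical `check_preview_request` helper (not part of any Tamr API):

```python
def check_preview_request(cluster_sizes, max_records=5000):
    """Return the total record count, or raise if it exceeds max_records."""
    total = sum(cluster_sizes)
    if total > max_records:
        raise ValueError(
            f"preview needs {total} records, limit is {max_records}"
        )
    return total
```

For example, three clusters of 2000 records each total 6000 records and would be rejected under the default limit.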

TAMR_REMOTE_ES_ENABLED

Whether Elasticsearch is local (running on the same VM as Tamr) or remote (running on one or more different VMs).

DEFAULT VALUE

False

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_REMOTE_HBASE_ENABLED

Whether HBase is local (running on the same VM as Tamr) or remote (running on one or more different VMs).

DEFAULT VALUE

False

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_REMOTE_SPARK_ENABLED

Whether Spark is local (running on the same VM as Tamr) or remote (running on one or more different VMs).

DEFAULT VALUE

False

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_RUNNER_CACHE_TIMEOUT

How long a sample dataset stays cached in memory in the preview runner after it was last accessed. Increasing this timeout increases memory usage, but also results in faster previews for queries that involve the same datasets.

DEFAULT VALUE

1 day

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
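
The behavior described for TAMR_RUNNER_CACHE_TIMEOUT resembles an expire-after-access cache, where each read resets the timer. The sketch below illustrates that idea under stated assumptions: the `AccessExpiringCache` class is hypothetical, and the preview runner's real cache is not documented here.

```python
import time


class AccessExpiringCache:
    """Cache whose entries expire when not accessed for timeout_seconds."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, time of last access)

    def put(self, key, value):
        self._entries[key] = (value, self._clock())

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, last_access = entry
        now = self._clock()
        if now - last_access > self._timeout:
            # Idle for too long: evict and report a miss.
            del self._entries[key]
            return None
        # A hit resets the idle timer.
        self._entries[key] = (value, now)
        return value
```

With a timeout of 86400 seconds (1 day), a dataset that is previewed at least once a day stays in memory indefinitely, which is the source of the memory trade-off noted above.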

TAMR_RUNNER_CORES

DEFAULT VALUE

2

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_RUNNER_EXECUTOR_MEMORY

DEFAULT VALUE

1024m

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_RUNNER_EXTRA_ARGS

DEFAULT VALUE

{{ TAMR_UNIFY_EXTRA_ARGS }}

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_RUNNER_LONG_POLLING_FREQ_SECONDS

DEFAULT VALUE

15

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_RUNNER_SPARK_PROPS

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_RUNNER_WORKING_DIR

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/tamr/previewRunner

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_SAML_ATTRIBUTE_DECRYPT_KEY_PATH

Location of the private key file that decrypts the authentication response from the server, specifically the attributes. If it is empty, Tamr assumes that the IdP server is sending unencrypted data. The matching public key is provided to the Identity Provider for its definition of this Service Provider in its metadata.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_SAML_AUTH_SIGNING_KEY_PATH

Location of the private key file that is used to sign authentication requests to the IdP. If it is empty, the service does not sign its authentication requests. The matching public key is provided to the Identity Provider for its definition of this Service Provider in its metadata.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_SAML_AUTH_COMPARISON_TYPE

The minimum authentication method strength required when creating the authentication context, based on the signicat.security-level value of the authentication method.

DEFAULT VALUE

minimum

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_SAML_AUTH_CONTEXT_TYPE

The desired authentication method to use for signing in.

DEFAULT VALUE

urn:oasis:names:tc:SAML:2.0:ac:classes:Password

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_SAML_NAME_ID_POLICY

The schema type policy for NameId when getting a response from the IdP.

DEFAULT VALUE

urn:oasis:names:tc:SAML:2.0:nameid-format:transient

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_SAML_EMAIL_FIELD

The field name in the SAML auth response that represents the email of the authenticated principal.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_SAML_ENTITY_ID

The ID used to describe this Service Provider. Used by the Identity Provider to look up relevant metadata (for example, the public key used to encrypt auth messages).

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_SAML_FIRST_NAME_FIELD

The field name in the SAML auth response that represents the first name of the authenticated principal.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_SAML_GROUP_MEMBERSHIP_FIELD

The field name in the SAML auth response that represents the memberships of the authenticated principal.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_SAML_IDP_CERTIFICATE_PATH

Location of the certificate file that is used to validate the response sent back from the IdP. If it is empty, assume that the IdP server is sending unsigned data. The matching private key is part of the IdP's metadata.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_SAML_LAST_NAME_FIELD

The field name in the SAML auth response that represents the last name of the authenticated principal.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_SAML_PRINCIPAL_FIELD

The field name in the SAML auth response that represents the authenticated principal. If specified, this overrides the default behavior, which is to get the principal from the NameID.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_SAML_SSO_LOCATION

The URL of the Identity Provider that the user is directed to in order to initiate single sign on.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_SERVICES_START_UP_NUMBER_OF_RETRIES

Number of retries to connect to Tamr microservices upon startup. Defaults to 50 times.

DEFAULT VALUE

50

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_SERVICES_START_UP_RETRY_INTERVAL_SECONDS

The interval between retries to connect to Tamr microservices upon startup. Defaults to 6 seconds.

DEFAULT VALUE

6

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
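
Multiplying the two startup settings gives a rough upper bound on how long startup waits for the microservices. The sketch below works through that arithmetic; the helper name is hypothetical, and the bound ignores the time each connection attempt itself takes.

```python
def max_startup_wait_seconds(retries=50, retry_interval_seconds=6):
    """Approximate upper bound on time spent waiting between retries."""
    return retries * retry_interval_seconds


print(max_startup_wait_seconds())  # 300 seconds, i.e. 5 minutes
```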

TAMR_SPARK_BROADCAST_ROW_LIMIT

DEFAULT VALUE

300000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_SPARK_BROADCAST_SIZE_LIMIT_BYTES

DEFAULT VALUE

134217728

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_SPARK_CORES

The total number of cores to use for the Spark cluster.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrSparkCores
  • secure: False

TAMR_SPARK_HOME

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/spark-2.4.5-bin-hadoop2.7/

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_SPARK_LOGS

The directory to use for Spark log files.

DEFAULT VALUE

{{ TAMR_SPARK_HOME }}/logs

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False

TAMR_SPARK_MEMORY

The total Spark memory across all executors. This value is used only on single-node deployments, so that you can make sure all processes have enough memory. When Spark is remote, memory is defined by the individual executor processes. As a rule, set TAMR_SPARK_MEMORY to approximately 10% more than TAMR_SPARK_DRIVER_MEM + (TAMR_JOB_SPARK_EXECUTOR_MEM * TAMR_JOB_SPARK_EXECUTOR_INSTANCES).

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrSparkMemory
  • secure: True
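
The sizing rule for TAMR_SPARK_MEMORY can be worked through as follows. This is a sketch of the arithmetic only: the function name and the sample sizes are hypothetical, and all three inputs are assumed to be in the same unit.

```python
def recommended_spark_memory(driver_mem, executor_mem, executor_instances,
                             overhead=0.10):
    """Apply the rule: ~10% more than driver + executors * instances."""
    base = driver_mem + executor_mem * executor_instances
    return base * (1 + overhead)


# Hypothetical sizes in GB: 5 (driver) + 13 * 2 (executors) = 31, plus 10%.
print(round(recommended_spark_memory(5, 13, 2), 1))
```

With these sample values the rule suggests setting TAMR_SPARK_MEMORY to about 34 GB.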

TAMR_SPARK_PIPELINE_DRIVER_THREADS

Experimental. Sets the level of parallelism when Tamr is executing jobs in Spark. A value greater than 1 will cause nodes in the job graph to execute concurrently, subject to order constraints (each node must wait until its dependencies are complete) and to the number of available execution threads. Default value is 1, i.e., serial execution.

DEFAULT VALUE

1

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
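
The execution model described for TAMR_SPARK_PIPELINE_DRIVER_THREADS (nodes run concurrently once their dependencies are complete, limited by the thread count) can be sketched with a thread pool. This is an illustrative scheduler only, not Tamr's code. The `run_job_graph` function, the `dependencies` mapping, and the `run_node` callable are all hypothetical.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_job_graph(dependencies, run_node, driver_threads=1):
    """Run every node after its dependencies; return the completion order.

    dependencies maps each node to the set of nodes it depends on.
    """
    remaining = {node: set(deps) for node, deps in dependencies.items()}
    finished = []
    running = {}  # future -> node
    with ThreadPoolExecutor(max_workers=driver_threads) as pool:
        while remaining or running:
            done_nodes = set(finished)
            ready = [n for n, deps in remaining.items() if deps <= done_nodes]
            for node in ready:
                del remaining[node]
                running[pool.submit(run_node, node)] = node
            if not running:
                raise ValueError("dependency cycle or unknown dependency")
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any failure from the node
                finished.append(running.pop(future))
    return finished
```

With `driver_threads=1` the pool runs one node at a time, which corresponds to the serial default. Larger values let independent nodes overlap while the dependency check still enforces ordering.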

TAMR_SPARK_TEMP_DIR

DEFAULT VALUE

{{ TAMR_SPARK_HOME }}/temp

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_SPARK_WORKDIR

Identifies the directory to use for the Spark working directory.

DEFAULT VALUE

{{ TAMR_SPARK_HOME }}/workDir

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

TAMR_SPNEGO_KEYTAB_FILE_PATH

For sites with TAMR_UNIFY_ENABLE_SPNEGO set to 'true', used for SSO authentication through SPNEGO.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_SPNEGO_SERVICE_PRINCIPAL

For sites with TAMR_UNIFY_ENABLE_SPNEGO set to 'true', used for SSO authentication through SPNEGO.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_STORAGE_PROVIDERS

Stores JSON keys for name, description, and provider type to identify external storage providers to use in addition to the Tamr primary storage space.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True

TAMR_SYSTEM_PASSWORD

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True
  • description: None

TAMR_SYSTEM_USERNAME

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True
  • description: None

TAMR_TAXONOMY_ADMIN_BIND_PORT

DEFAULT VALUE

9401

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_TAXONOMY_ADMIN_PORT

DEFAULT VALUE

9401

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_TAXONOMY_BIND_PORT

DEFAULT VALUE

9400

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_TAXONOMY_HOSTNAME

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_TAXONOMY_LOG_LEVEL

DEFAULT VALUE

INFO

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_TAXONOMY_PORT

DEFAULT VALUE

9400

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_TILE_SERVERS

DEFAULT VALUE

[]

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_TMP_DIR

A temporary directory used by Unify processes.

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/tmp

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify', 'supporting']
  • secure: False

TAMR_TMP_DIR_DAYS

Files in the temp directory older than the specified number of days are cleaned up, based on last modification date.

DEFAULT VALUE

3

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify', 'supporting']
  • secure: False
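
To see which files a given TAMR_TMP_DIR_DAYS value would affect, you can compare each file's last modification time against the cutoff. The sketch below only lists candidates and deletes nothing. The `stale_files` helper is hypothetical and is not how Tamr performs its cleanup.

```python
import os
import time


def stale_files(tmp_dir, max_age_days=3, now=None):
    """Return files under tmp_dir last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    stale = []
    for root, _dirs, files in os.walk(tmp_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getmtime(path) < cutoff:
                stale.append(path)
    return sorted(stale)
```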

TAMR_TOTAL_CORES

The number of CPU processing cores available for Tamr and its dependencies to use. If no value is specified and total cores is calculated by the supplied formula, Tamr automatically recalculates the number of cores on restart.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrTotalCores
  • secure: False

TAMR_TOTAL_MEMORY

The total memory available on the system. It indicates how much of the system memory should be given to Tamr and all of its third-party dependencies. Specify the value as a number only, without a unit suffix; the unit is kilobytes. The pool of memory Tamr allocates is TAMR_TOTAL_MEMORY minus TAMR_OS_HEADROOM. If no value is specified and total memory is calculated by the supplied formula, Tamr automatically recalculates total memory on restart.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrTotalMemory
  • secure: False
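
The allocation pool described for TAMR_TOTAL_MEMORY is a simple subtraction. The sketch below works it through in kilobytes. The function name and sample sizes are hypothetical, and it assumes TAMR_OS_HEADROOM is expressed in the same unit, which this page does not confirm.

```python
def allocatable_memory_kb(total_memory_kb, os_headroom_kb):
    """Pool available to Tamr: TAMR_TOTAL_MEMORY minus TAMR_OS_HEADROOM."""
    if os_headroom_kb > total_memory_kb:
        raise ValueError("headroom exceeds total memory")
    return total_memory_kb - os_headroom_kb


# Hypothetical host: 64 GiB total, 2 GiB reserved for the OS.
total_kb = 64 * 1024 * 1024     # 67108864
headroom_kb = 2 * 1024 * 1024   # 2097152
print(allocatable_memory_kb(total_kb, headroom_kb))  # 65011712
```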

TAMR_TRACKING_ENABLE

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_TRACKING_INSTANCE_NAME

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_TRANSFORM_ADMIN_BIND_PORT

DEFAULT VALUE

9161

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_TRANSFORM_ADMIN_PORT

DEFAULT VALUE

9161

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_TRANSFORM_BIND_PORT

DEFAULT VALUE

9160

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_TRANSFORM_HOSTNAME

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_TRANSFORM_LOG_LEVEL

DEFAULT VALUE

INFO

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_TRANSFORM_PORT

DEFAULT VALUE

9160

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_UNIFY_ADMIN_BIND_PORT

DEFAULT VALUE

9101

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_UNIFY_ADMIN_PORT

DEFAULT VALUE

9101

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_UNIFY_BACKUP_AWS_ROLE_BASED_ACCESS

Set to true if Tamr should use EC2 instance profile (role-based) credentials instead of static credentials.

DEFAULT VALUE

False

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_UNIFY_BACKUP_AWS_ACCESS_KEY_ID

Stores the key ID for an AWS S3 backup location.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True

TAMR_UNIFY_BACKUP_AWS_SECRET_ACCESS_KEY

Stores the access key for an AWS S3 backup location.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True

TAMR_UNIFY_BACKUP_ES

Defines whether or not to back up Elasticsearch.

DEFAULT VALUE

true

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_UNIFY_BACKUP_EXTRA_CONFIG_PROPS

Accepts a comma-separated list of configuration variable names to back up. Tamr backs up user-defined settings for these variables in addition to settings for variables with machineSpecific: False.

DEFAULT VALUE

[]

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
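
According to the machineSpecific definition at the top of this page, values with machineSpecific set to False are saved to backups, and TAMR_UNIFY_BACKUP_EXTRA_CONFIG_PROPS adds further names to that set. The sketch below models that selection. The `variables_to_back_up` helper is hypothetical, and it assumes the setting is supplied as a comma-separated string.

```python
def variables_to_back_up(machine_specific, extra_config_props=""):
    """Return the sorted names of variables whose values would be backed up.

    machine_specific maps variable name -> its machineSpecific flag.
    """
    extras = {n.strip() for n in extra_config_props.split(",") if n.strip()}
    return sorted(
        name
        for name, is_machine_specific in machine_specific.items()
        if not is_machine_specific or name in extras
    )


flags = {
    "TAMR_UNIFY_BACKUP_ES": False,   # backed up by default
    "TAMR_UNIFY_BACKUP_URI": True,   # skipped unless listed explicitly
    "TAMR_UNIFY_PORT": True,
}
print(variables_to_back_up(flags, "TAMR_UNIFY_BACKUP_URI"))
```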

TAMR_UNIFY_BACKUP_HADOOP_TMP_DIR

Identifies an alternative backup temporary directory to use in place of the default, which is a temporary directory within the Tamr installation directory.

DEFAULT VALUE

{{ TAMR_TMP_DIR }}/backup

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_UNIFY_BACKUP_HDINSIGHT_STORAGE_ACCOUNT_KEY

Primary access key for the Azure storage account backing the HDInsight HBase cluster.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True

TAMR_UNIFY_BACKUP_HDINSIGHT_STORAGE_ACCOUNT_NAME

Name of the Azure storage account backing the HDInsight HBase cluster.

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: True

TAMR_UNIFY_BACKUP_NUM_THREADS

DEFAULT VALUE

1

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_UNIFY_BACKUP_URI

Identifies the location for storing backup files.

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/tamr/backups

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_UNIFY_BIND_PORT

Identifies the default access HTTP port.

DEFAULT VALUE

9100

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_UNIFY_DATA_DIR

Identifies a readable/writable path in HDFS for Tamr to read/write data.

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}/tamr/unify-data

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_UNIFY_ENABLE_HTTPS

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_UNIFY_ENABLE_SAML

Set to 'true' for sites that require SAML 2.0 for web-based, cross-domain single sign-on access.

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_UNIFY_ENABLE_SPNEGO

Set to 'true' for sites using SSO authentication through SPNEGO.

DEFAULT VALUE

false

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_UNIFY_EXTRA_ARGS

DEFAULT VALUE

``

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_UNIFY_HOSTNAME

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_UNIFY_HTTPS_KEYSTORE_PASSWORD

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_UNIFY_HTTPS_KEYSTORE_PATH

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_UNIFY_HTTPS_VALIDATE_CERTS

DEFAULT VALUE

true

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_UNIFY_HTTPS_VALIDATE_PEERS

DEFAULT VALUE

true

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_UNIFY_KERBEROS_KRB5_JAVA_ARGS

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • formula: com.tamr.procurify.admin.commands.config.formulas.KerberosJavaArgs
  • secure: False
  • description: None

TAMR_UNIFY_LOG_LEVEL

Sets the log level for "unify", the microservice for the front-end user interface of Tamr.

DEFAULT VALUE

INFO

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False

TAMR_UNIFY_PORT

Identifies the default access HTTP port.

DEFAULT VALUE

9100

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • secure: False

TAMR_ROOT_DIR

DEFAULT VALUE

{{ TAMR_UNIFY_HOME }}

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_YARN_LOG_DIR

The location for YARN ResourceManager and NodeManager logs.

DEFAULT VALUE

{{ TAMR_LOG_DIR }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

TAMR_YARN_NODE_MANAGER_HOST

The hostname of the Spark YARN NodeManager.

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

TAMR_YARN_NODE_MANAGER_PORT

The port of the Spark YARN NodeManager.

DEFAULT VALUE

8042

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

TAMR_YARN_RESOURCE_MANAGER_ADMIN_PORT

DEFAULT VALUE

8033

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_YARN_RESOURCE_MANAGER_HOST

The hostname of the Spark YARN ResourceManager.

DEFAULT VALUE

{{ HOST_IP }}

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

TAMR_YARN_RESOURCE_MANAGER_PORT

DEFAULT VALUE

8032

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_YARN_RESOURCE_MANAGER_RESOURCE_TRACKER_PORT

DEFAULT VALUE

8031

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_YARN_RESOURCE_MANAGER_SCHEDULER_PORT

DEFAULT VALUE

8030

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_YARN_RESOURCE_MANAGER_WEBUI_HTTPS_PORT

DEFAULT VALUE

8090

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_YARN_RESOURCE_MANAGER_WEBUI_PORT

DEFAULT VALUE

8088

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_YARN_TEMP_DIR

A temporary directory used by YARN processes.

DEFAULT VALUE

{{ TAMR_HADOOP_HOME }}/temp

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

TAMR_YARN_SCHEDULER_CAPACITY_MAXIMUM_AM_RESOURCE_PERCENT

The maximum percentage of cluster resources that can be used to run application masters. This setting controls the number of concurrently running applications.

DEFAULT VALUE

1.0

PROPERTIES

  • machineSpecific: True
  • dependencies: ['supporting']
  • secure: False

TAMR_ZK_BOOTSTRAPPED

DEFAULT VALUE

true

PROPERTIES

  • machineSpecific: False
  • dependencies: ['supporting']
  • secure: False
  • description: None

TAMR_ZK_CONFIG_URI

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrZkConfigUri
  • secure: False
  • description: None

TAMR_ZK_CONFIG_URI_AUTHORITY

DEFAULT VALUE

{{ TAMR_ZK_SERVERS }}

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_ZK_CONFIG_URI_PATH

DEFAULT VALUE

/{{ TAMR_ZK_NAMESPACE }}/conf

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_ZK_CONNECT_TIMEOUT

DEFAULT VALUE

120000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_ZK_NAMESPACE

DEFAULT VALUE

tamr/unify001

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_ZK_RETRY_COUNT

DEFAULT VALUE

10

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_ZK_RETRY_INTERVAL

DEFAULT VALUE

500

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

TAMR_ZK_SERVERS

DEFAULT VALUE

null

PROPERTIES

  • machineSpecific: True
  • dependencies: ['unify']
  • formula: com.tamr.procurify.admin.commands.config.formulas.TamrZkServers
  • secure: False
  • description: None

TAMR_ZK_SESSION_TIMEOUT

DEFAULT VALUE

120000

PROPERTIES

  • machineSpecific: False
  • dependencies: ['unify']
  • secure: False
  • description: None

Updated 2 months ago
