Spark Environment Configuration

Configure the Spark environment in Tamr.

Configuring the Spark Environment

You can configure the following Spark environment variables. A sketch of setting them follows the table.

| Configuration Variable | Description |
| --- | --- |
| `TAMR_SPARK_MEMORY` | The total memory to use for the Spark cluster. For information on calculating this value, see YARN Cluster Manager Jobs. |
| `TAMR_SPARK_CORES` | The total number of cores to use for the Spark cluster. |
| `TAMR_JOB_SPARK_CLUSTER` | The full URL of the Spark cluster to use. The default value is `yarn`. |
| `TAMR_JOB_SPARK_CONFIG_OVERRIDES` | A list of named sets of Spark configuration overrides. See the Support Help Center knowledge base for details on the override format, and the sketch following this table for an illustration. |
| `TAMR_JOB_SPARK_DRIVER_MEM` | The amount of memory a Tamr job uses for the driver process, for example 1G or 2G. |
| `TAMR_JOB_SPARK_EVENT_LOGS_DIR` | The directory for storing event logs for Spark jobs. |
| `TAMR_JOB_SPARK_EXECUTOR_MEM` | The amount of memory a Tamr job uses per executor process, for example 2G or 8G. |
| `TAMR_JOB_SPARK_EXECUTOR_CORES` | The number of cores to use per executor process, for example 2. |
| `TAMR_SPARK_WORKDIR` | The directory to use as the Spark working directory. |
| `TAMR_SPARK_LOGS` | The directory to use for Spark log files. |
| `TAMR_JOB_SPARK_SUBMIT_TIMEOUT_SECONDS` | The timeout period, in seconds, for Spark job submission. The default is 300. |
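
As a point of reference, the following is a minimal sketch of changing these values on an on-premises deployment. It assumes Tamr's admin utility at `<tamr-home-directory>/tamr/utils/unify-admin.sh`, with `TAMR_HOME` standing in for the installation directory; the specific values shown are illustrative, not sizing recommendations.

```sh
# Illustrative only: set Spark resource variables with Tamr's admin
# utility, assuming TAMR_HOME is the Tamr installation directory.
"${TAMR_HOME}/tamr/utils/unify-admin.sh" config:set \
  TAMR_SPARK_MEMORY=48G \
  TAMR_SPARK_CORES=12 \
  TAMR_JOB_SPARK_DRIVER_MEM=2G \
  TAMR_JOB_SPARK_EXECUTOR_MEM=8G \
  TAMR_JOB_SPARK_EXECUTOR_CORES=4

# Read back a resolved value to confirm the change, then restart Tamr
# and its dependencies so the new configuration takes effect.
"${TAMR_HOME}/tamr/utils/unify-admin.sh" config:get TAMR_SPARK_MEMORY
```

Note that `TAMR_SPARK_MEMORY` and `TAMR_SPARK_CORES` bound the cluster as a whole, while the `TAMR_JOB_*` variables size each driver and executor process, so per-executor settings should fit within the cluster totals.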
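The override mechanism can be sketched in the same way. This is a hypothetical illustration only: the exact schema for `TAMR_JOB_SPARK_CONFIG_OVERRIDES` is documented in the Support Help Center knowledge base, and the keys used here (`name`, `config`) are placeholders rather than the documented field names.

```sh
# Hypothetical illustration of a named set of Spark overrides, passed
# as a single string value. "name" and "config" are placeholder keys;
# consult the Support Help Center knowledge base for the real schema.
"${TAMR_HOME}/tamr/utils/unify-admin.sh" config:set \
  'TAMR_JOB_SPARK_CONFIG_OVERRIDES=[{"name": "largeJobs", "config": {"spark.executor.memory": "12G", "spark.executor.cores": "6"}}]'
```

The idea is that a job configured to use the `largeJobs` set would run with those Spark properties in place of the cluster-wide defaults.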

