
Spark Environment Configuration

Configure the Spark environment in Tamr.

Configuring the Spark Environment

You can configure the following Spark environment variables. Example commands for setting them follow the table.

| Configuration Variable | Description |
| --- | --- |
| `TAMR_SPARK_MEMORY` | The total memory to allocate to the Spark cluster. For information on calculating this value, see YARN Cluster Manager Jobs. |
| `TAMR_SPARK_CORES` | The total number of cores to allocate to the Spark cluster. |
| `TAMR_JOB_SPARK_CLUSTER` | The full URL of the Spark cluster to use. The default value is `yarn`. |
| `TAMR_JOB_SPARK_CONFIG_OVERRIDES` | A list of named sets of Spark configuration overrides (see the sketch after this table). |
| `TAMR_JOB_SPARK_DRIVER_MEM` | The amount of memory a Tamr job uses for the driver process, such as `1G` or `2G`. |
| `TAMR_JOB_SPARK_EVENT_LOGS_DIR` | The directory in which to store logs for Spark jobs. |
| `TAMR_JOB_SPARK_EXECUTOR_MEM` | The amount of memory a Tamr job uses per executor process, such as `2G` or `8G`. |
| `TAMR_JOB_SPARK_EXECUTOR_CORES` | The number of cores to use per executor process, such as `2`. |
| `TAMR_SPARK_WORKDIR` | The directory to use as the Spark working directory. |
| `TAMR_SPARK_LOGS` | The directory to use for Spark log files. |
| `TAMR_SPARK_SUBMIT_TIMEOUT` | The timeout period, in seconds, for Spark job submission. The default is `300` (5 minutes). |
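
How these variables are set depends on your deployment. The commands below are a minimal sketch that assumes the Tamr administrative utility at `${TAMR_HOME}/tamr/utils/unify-admin.sh` with `config:set` and `config:get` subcommands; verify the path and subcommands against your installation. The values also illustrate the sizing arithmetic: three 8G executors plus a 2G driver need at least 26G of total Spark memory and, at 2 cores per executor, 6 cores.

```sh
# Minimal sketch; the unify-admin.sh path and subcommands are assumptions
# about a typical Tamr installation.
cd "${TAMR_HOME}/tamr/utils"

# Total budget for the Spark cluster: 3 executors x 8G + 2G driver = 26G.
./unify-admin.sh config:set TAMR_SPARK_MEMORY="26G" TAMR_SPARK_CORES="6"

# Per-process sizing for Tamr jobs.
./unify-admin.sh config:set \
  TAMR_JOB_SPARK_EXECUTOR_MEM="8G" \
  TAMR_JOB_SPARK_EXECUTOR_CORES="2" \
  TAMR_JOB_SPARK_DRIVER_MEM="2G"

# Raise the submit timeout from the 300-second default to 10 minutes.
./unify-admin.sh config:set TAMR_SPARK_SUBMIT_TIMEOUT="600"

# Confirm the values took effect.
./unify-admin.sh config:get TAMR_SPARK_MEMORY TAMR_SPARK_CORES
```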
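
The table does not show the schema of `TAMR_JOB_SPARK_CONFIG_OVERRIDES`. The snippet below is a hypothetical shape only: the `name` and `sparkConfig` field names and the set name `smallJobOverrides` are assumptions, while `spark.executor.memory` and `spark.executor.cores` are standard Spark properties. Confirm the exact schema in the Tamr documentation for your version.

```sh
# Hypothetical override shape: "name" and "sparkConfig" are assumed field
# names, not confirmed by this page.
./unify-admin.sh config:set TAMR_JOB_SPARK_CONFIG_OVERRIDES='[
  {
    "name": "smallJobOverrides",
    "sparkConfig": {
      "spark.executor.memory": "2G",
      "spark.executor.cores": "2"
    }
  }
]'
```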