Spark Environment Configuration
Configure the Spark environment in Tamr.
Configuring the Spark Environment
You can configure the following Spark environment variables. An example of setting them follows the table.
| Configuration Variable | Description |
|---|---|
| `TAMR_SPARK_MEMORY` | The total memory to use for the Spark cluster. For information on calculating this value, see YARN Cluster Manager Jobs. |
| `TAMR_SPARK_CORES` | The total number of cores to use for the Spark cluster. |
| `TAMR_JOB_SPARK_CLUSTER` | The full URL of the Spark cluster being used. The default value is `yarn`. |
| `TAMR_JOB_SPARK_CONFIG_OVERRIDES` | A list of named sets of Spark configuration overrides. |
| `TAMR_JOB_SPARK_DRIVER_MEM` | The amount of memory a Tamr job uses for the driver process, such as `1G` or `2G`. |
| `TAMR_JOB_SPARK_EVENT_LOGS_DIR` | The directory for storing logs for Spark jobs. |
| `TAMR_JOB_SPARK_EXECUTOR_MEM` | The amount of memory a Tamr job uses per executor process, such as `2G` or `8G`. |
| `TAMR_JOB_SPARK_EXECUTOR_CORES` | The number of cores to use per executor process, such as `2`. |
| `TAMR_SPARK_WORKDIR` | The directory to use as the Spark working directory. |
| `TAMR_SPARK_LOGS` | The directory to use for Spark log files. |
| `TAMR_SPARK_SUBMIT_TIMEOUT` | The timeout period, in seconds, for Spark submitters. The default is 300 (five minutes). |
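As a minimal sketch, the shell session below sets a few of these variables with Tamr's `unify-admin.sh` administrative utility. The install path `/opt/tamr`, the specific memory and core values, and the use of `config:set` with multiple key-value pairs are illustrative assumptions; confirm the utility's location and syntax against the documentation for your Tamr version, and restart Tamr afterward so the new values take effect.

```bash
# Hypothetical Tamr install location; adjust for your deployment.
TAMR_HOME=/opt/tamr

# Size the Spark cluster as a whole: total memory and total cores.
# (Values shown are examples only; size them for your hardware.)
"${TAMR_HOME}/tamr/utils/unify-admin.sh" config:set \
    TAMR_SPARK_MEMORY=60G \
    TAMR_SPARK_CORES=24

# Size the individual Spark processes: driver memory, plus memory
# and cores for each executor.
"${TAMR_HOME}/tamr/utils/unify-admin.sh" config:set \
    TAMR_JOB_SPARK_DRIVER_MEM=2G \
    TAMR_JOB_SPARK_EXECUTOR_MEM=8G \
    TAMR_JOB_SPARK_EXECUTOR_CORES=2
```

Note that the per-executor settings must fit within the cluster totals: with the example values above, the cluster's `TAMR_SPARK_MEMORY` and `TAMR_SPARK_CORES` budget would accommodate several 8G, 2-core executors alongside the 2G driver.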