Backup

If a backup is already in progress, an error is returned to the client. Otherwise, the backup service creates the backup directory (based on the current timestamp by default) and a manifest file that describes the backup.

Upon successful initiation of a backup, the service returns an identifier representing the backup. The backup proceeds asynchronously and you can use the identifier to look up the status of the backup via the Check Backup Status endpoint.
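
As a minimal sketch of the asynchronous flow described above, the following Python helper polls a backup's status until it reaches a terminal state. The `fetch_status` callable, the state names `SUCCEEDED` and `FAILED`, and the idea that the status reduces to a single string are all assumptions made for illustration. Replace them with a real call to the Check Backup Status endpoint and the values your deployment actually returns.

```python
import time
from typing import Callable

# Assumed terminal state names, for illustration only; check the actual
# values returned by the Check Backup Status endpoint.
TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


def wait_for_backup(
    backup_id: str,
    fetch_status: Callable[[str], str],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
) -> str:
    """Poll until the backup reaches a terminal state, then return that state.

    fetch_status is a hypothetical callable that takes the backup identifier
    and returns the current state as a string.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = fetch_status(backup_id)
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"backup {backup_id} still in state {state!r}")
        time.sleep(poll_seconds)
```

Passing the status lookup in as a callable keeps the polling logic independent of how the endpoint is reached, so the same loop can be exercised with a stub before it is pointed at a live deployment.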

A backup can be written to S3, HDFS, or a local folder.

Response Fields

Field	Description
id	The ID of the backup started. Unique across the entire deployment of Tamr.
type	The shortened ID of the backup. Use this for querying the status.
description	A description of the backup.
status	A JSON object containing a field for the start time.
created	A JSON object containing a field for the time created.
lastModified	A JSON object containing a field for the time last modified.

Implementation

📘
API Properties
Request Type: Asynchronous
Implementation Details:
The system must be in read-only mode while the backup snapshot is created. A ZooKeeper node signals the system to enter read-only mode. The system returns to read/write mode once the snapshot is complete, and the copying of files then proceeds in the background.
Steps of the backup:

1. The system enters read-only mode.
2. The Postgres database, application configuration, and files in HDFS are backed up to the backup directory.
3. An HBase snapshot is created and the Elasticsearch backup begins.
4. The system re-enters read/write mode.
5. The HBase snapshot is copied to the backup directory, and the service polls Elasticsearch until its backup completes.

Additional Details:

The backup service writes a _SUCCEEDED file to the backup directory if the backup succeeds. If the backup fails, the service stops early and writes a _FAILED file instead.
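
The marker files described above can be checked directly when the backup location is a local folder. The sketch below assumes the backup directory is reachable on the local filesystem; the function name `backup_outcome` is illustrative only, and S3 or HDFS locations would need their own client libraries.

```python
from pathlib import Path
from typing import Optional


def backup_outcome(backup_dir: str) -> Optional[bool]:
    """Return True if _SUCCEEDED exists, False if _FAILED exists, else None."""
    root = Path(backup_dir)
    if (root / "_SUCCEEDED").exists():
        return True
    if (root / "_FAILED").exists():
        return False
    # No marker file yet: the backup may still be running.
    return None
```

A result of `None` does not by itself indicate an error, because neither file is written until the backup finishes or fails.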
