Checkpoint

Overview

🚧

This is an advanced feature

Using checkpoint incorrectly can degrade performance. If you are not experiencing performance delays, do not use it.

Checkpoint is a transformation that does not change the content of your data, but rather changes how your scripts interact with Spark.

If your transformations run slowly, checkpoint can improve performance by breaking a long series of transformations into manageable chunks. Instead of asking Tamr's transformation service to keep track of every transformation in your script at once, checkpoint tells it to work on one set of transformations at a time and cache the results before moving on to the next set.
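The idea of working on one set of transformations and caching the result can be pictured with a small toy model. The Python sketch below is only an illustration of the concept, not Tamr's or Spark's implementation: the `LazyDataset` class, its methods, and the three step functions are hypothetical names invented for this example. Each `transform` call only records a step, and `checkpoint` runs the recorded steps once, stores the result, and clears the list of steps that still has to be remembered.

```python
class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (hypothetical, not a Tamr or Spark API)."""

    def __init__(self, rows, pending=None):
        self._rows = list(rows)              # last materialized (cached) rows
        self._pending = list(pending or [])  # steps recorded but not yet run

    def transform(self, step):
        # Record the step only; nothing is computed here.
        return LazyDataset(self._rows, self._pending + [step])

    def checkpoint(self):
        # Run every pending step once, cache the rows, and forget the steps.
        return LazyDataset(self.collect(), [])

    def pending_steps(self):
        return len(self._pending)

    def collect(self):
        # Apply all pending steps to the cached rows and return the result.
        rows = self._rows
        for step in self._pending:
            rows = step(rows)
        return rows


# Hypothetical transformations; each returns new rows instead of mutating its input.
def trim_names(rows):
    return [{**r, "name": r["name"].strip()} for r in rows]


def drop_empty(rows):
    return [r for r in rows if r["name"]]


def upper_names(rows):
    return [{**r, "name": r["name"].upper()} for r in rows]


source = LazyDataset([{"name": " ada "}, {"name": "  "}, {"name": "grace"}])

# First logical chunk of work: two steps are recorded, none has run yet.
chunk_one = source.transform(trim_names).transform(drop_empty)

# The checkpoint materializes the chunk; later work starts from the cached rows.
cached = chunk_one.checkpoint()

# Second chunk: only one step remains to be tracked.
result = cached.transform(upper_names)
print(result.pending_steps(), result.collect())
```

In this model the cost of a checkpoint is the work of materializing and storing the intermediate rows, which is consistent with the warning above that a poorly placed checkpoint can slow things down rather than speed them up.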

If you are experiencing performance delays, place a checkpoint between logical chunks of work. Choosing when and where to write checkpoints takes experience and experimentation. To write a checkpoint:

Checkpoint;