Data Movement Service

The data movement service allows you to ingest and export data files between Tamr Core and your cloud storage.

Tamr recommends using Core Connect to import and export large data files between Tamr Core and your cloud storage provider.

The data movement service (DMS) is designed to facilitate large data movement jobs between your cloud storage solution and your Tamr instance. After DMS is configured, you can use this service either through the Tamr user interface (UI) or through the DMS API.

DMS supports the CSV and Parquet formats for Tamr dataset ingest and export. Datasets are ingested from, and exported to, cloud storage within your cloud provider.

If DMS is enabled for your instance, users cannot download datasets to their local file system via the UI. This allows organizations to ensure all teams follow the appropriate data access policies, which are managed via their cloud storage accounts.

Important: Tamr Core users who need access to data files exported to cloud storage must be given access to the appropriate cloud storage locations.

Before You Use DMS

  • The current version of DMS supports API interaction only through command-line utilities, such as cURL.
  • DMS does not support Parquet files that include arrays with nulls.
  • For DMS jobs, the job ID is a GUID created by DMS. It uses a different format than the numeric job IDs created by Tamr.
  • For successfully completed DMS jobs, the status is completed rather than succeeded, which is the status reported for other jobs. See Managing Jobs.
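
The last two points matter for any script that tracks both DMS jobs and other Tamr jobs. The following is a minimal Python sketch of how such a script might tell the two apart. The function names are hypothetical, and the sketch assumes that the job ID and status have already been retrieved as strings and that status values are compared case-insensitively; it does not call the DMS API.

```python
import uuid


def is_dms_job_id(job_id: str) -> bool:
    """Return True if job_id parses as a GUID (DMS style), False otherwise."""
    try:
        uuid.UUID(job_id)
        return True
    except ValueError:
        # Numeric Tamr job IDs such as "42" are not valid GUIDs.
        return False


def job_succeeded(job_id: str, status: str) -> bool:
    """Check for the success status that matches the kind of job."""
    # DMS jobs report "completed"; other jobs report "succeeded".
    expected = "completed" if is_dms_job_id(job_id) else "succeeded"
    return status.lower() == expected
```

With these helpers, a single polling loop could treat either status as success, depending on which service created the job.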

Parquet File Support

Tamr can ingest all Parquet files, including complex Parquet files with lists, maps, and structs.

Single-level lists appear as they are defined, with null values appearing as primitive nulls (as opposed to string nulls). When exporting Parquet files, nulls are excluded completely, as defined in the Parquet specification.

Maps, structs, and lists nested deeper than two levels are partially supported; the column type is string, and the struct is converted to a string.

Note: If using the DMS API, set the inheritSchema option to false to convert all primitive types to string.

Below is an example of a complex struct converted to a string.

Complex struct converted to string.
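
For readers who want to reason about this rule in code, here is a rough Python sketch of one reading of it: single-level lists are kept as they are, while maps, structs, and lists nested deeper than two levels become strings. The function names are hypothetical, Python dicts stand in for Parquet maps and structs, and `json.dumps` is only a stand-in for the conversion. The exact string format that DMS produces is not specified here.

```python
import json


def list_depth(value) -> int:
    """Nesting depth of lists: 0 for a non-list, 1 for a flat list, and so on."""
    if not isinstance(value, list):
        return 0
    return 1 + max((list_depth(item) for item in value), default=0)


def stringify_if_complex(value):
    """Return value unchanged, or as a string if it is a map/struct or a deep list."""
    # Assumption: dicts model Parquet maps and structs.
    if isinstance(value, dict) or list_depth(value) > 2:
        return json.dumps(value)
    return value
```

Under these assumptions, `stringify_if_complex([1, None, 3])` returns the list unchanged with its primitive null intact, while `stringify_if_complex({"a": 1})` returns a string.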

Configuring and Using DMS

To configure and use DMS, see: