Tamr Documentation

Schema Mapping

A Schema Mapping project maps attributes from multiple source datasets into a single unified schema. A unified schema allows disparate data sources to be consolidated into one dataset with consistent attributes.

A unified schema is a list of attributes or fields associated with an entity (customer, organization, patient, etc.) across multiple datasets. In simple terms, it is the set of column headers into which Tamr will consolidate data.
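To make the idea concrete, the following is a minimal sketch in plain Python of consolidating two sources into one set of column headers. It is not Tamr's API. The dataset names, column names, and the helpers `to_unified` and `consolidate` are all hypothetical and exist only for illustration.

```python
# Hypothetical illustration only; none of these names come from Tamr.
# Two sources describe the same entity (a customer) with different columns.
crm_rows = [{"cust_name": "Acme Corp", "tel": "555-0100"}]
billing_rows = [{"CustomerName": "Globex", "Phone": "555-0199", "Region": "EU"}]

# The unified schema: the column headers that all data is consolidated into.
UNIFIED_SCHEMA = ["name", "phone", "region"]

# For each source dataset, a mapping of source attribute -> unified attribute.
MAPPINGS = {
    "crm": {"cust_name": "name", "tel": "phone"},
    "billing": {"CustomerName": "name", "Phone": "phone", "Region": "region"},
}


def to_unified(row, mapping):
    """Return a row keyed by the unified attributes; missing values are None."""
    unified = {attr: None for attr in UNIFIED_SCHEMA}
    for source_attr, value in row.items():
        target = mapping.get(source_attr)
        if target is not None:  # unmapped source attributes are skipped
            unified[target] = value
    return unified


def consolidate(datasets):
    """Apply each dataset's mapping and collect all rows into one list."""
    consolidated = []
    for dataset_name, rows in datasets.items():
        for row in rows:
            consolidated.append(to_unified(row, MAPPINGS[dataset_name]))
    return consolidated


unified_rows = consolidate({"crm": crm_rows, "billing": billing_rows})
```

Every row in `unified_rows` has the same three keys, regardless of which source it came from. An attribute that a source does not provide (here, `region` for the CRM row) is left as `None`.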

Schema mapping is commonly used in biopharmaceuticals. For example, Tamr's schema mapping can converge records from hundreds or thousands of clinical trials into a single standard CDISC SDTM version. Standardized data not only helps companies comply with FDA standards but also makes it easier to implement projects such as integrated, curated data hubs. Clean data is vital to driving scientific insight across many trials.


Workflow

The overall workflow is as follows:

  1. Add datasets
  2. Unify attributes
  3. Map source attributes to unified schema
  4. Generate attribute mapping recommendations
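As a rough illustration of what a mapping recommendation could look like, the sketch below suggests a unified attribute for a source attribute by comparing names with the standard-library `difflib.SequenceMatcher`. This is a simple stand-in based on string similarity, not the method Tamr uses to generate recommendations. The function names, the example attributes, and the `threshold` value are assumptions made for this example.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase a column name and drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def recommend(source_attr, unified_attrs, threshold=0.5):
    """Suggest the most similar unified attribute, or None if nothing is close.

    Name similarity is a stand-in heuristic chosen for this sketch;
    the threshold of 0.5 is arbitrary.
    """
    scored = [
        (SequenceMatcher(None, normalize(source_attr), normalize(u)).ratio(), u)
        for u in unified_attrs
    ]
    score, best = max(scored)
    return best if score >= threshold else None


unified_attrs = ["name", "phone", "region"]
suggestion = recommend("Phone_Number", unified_attrs)
no_suggestion = recommend("xyz", unified_attrs)
```

Here `suggestion` is `"phone"`, while `no_suggestion` is `None` because no unified attribute name is similar enough. In practice a reviewer would accept or reject each suggestion rather than apply it automatically.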

Optional steps:

  1. Tag datasets. Tagging provides metadata about each dataset and makes it possible to filter by tag in later steps.
  2. Profile unified attributes. Profiling can be done at any point in the workflow.
