A Schema Mapping project allows you to map attributes from many input datasets into a single set of attributes known as a unified schema. This is the schema for the unified dataset you are building. A unified schema consolidates attributes from disparate data sources into a set of attributes that are consistent across all of your input datasets.
A unified schema is a list of attributes associated with an entity, such as a customer or an organization, across multiple input datasets. You can think of it as the set of column headers for the table into which Tamr consolidates data.
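The idea can be sketched in plain Python. This is a conceptual illustration only, not the Tamr API: the dataset names, attribute names, and the `to_unified` helper are all hypothetical, chosen to show how differently named source attributes can land under one set of column headers.

```python
# Hypothetical illustration only; none of these names come from Tamr.

# The unified schema: the column headers of the consolidated table.
UNIFIED_SCHEMA = ["full_name", "email", "city"]

# One mapping per input dataset: source attribute -> unified attribute.
MAPPINGS = {
    "crm_export": {"Name": "full_name", "E-mail": "email", "Town": "city"},
    "web_signups": {"customer_name": "full_name", "email_addr": "email"},
}


def to_unified(dataset_name, record):
    """Re-key a source record by unified attributes.

    Unified attributes with no mapped source value are set to None;
    source attributes with no mapping are dropped.
    """
    mapping = MAPPINGS[dataset_name]
    unified = {attr: None for attr in UNIFIED_SCHEMA}
    for source_attr, value in record.items():
        target = mapping.get(source_attr)
        if target is not None:
            unified[target] = value
    return unified


rows = [
    to_unified(
        "crm_export",
        {"Name": "Ada Lovelace", "E-mail": "ada@example.com", "Town": "London"},
    ),
    to_unified(
        "web_signups",
        {"customer_name": "Alan Turing", "email_addr": "alan@example.com", "referrer": "ad"},
    ),
]
```

In this sketch, every row in `rows` has the same keys in the same order, even though the two source datasets use different attribute names and the second one has no attribute for `city`.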
For example, you can use schema mapping in Tamr to converge records from thousands of clinical trials into a single standard CDISC SDTM version. Having standardized data allows companies to comply with FDA standards. It also makes it easier to implement projects such as building integrated, curated data hubs. Data hubs with clean data enable scientific insights across many clinical trials.
The schema mapping workflow consists of the following stages:
- Add datasets.
- Unify attributes.
- Map source attributes to a unified schema.
- Generate attribute mapping recommendations.
- Optional. Tag datasets. Tagging datasets adds metadata about each dataset. This allows you to filter datasets by tags in later stages of data mastering.
- Optional. Profile a dataset. You can run profiling at any point in the workflow.