The Schema Mapping tab lets you review all attributes from all input datasets and specify how Tamr maps them to unified attributes.
Unified attributes are derived attributes that you create by mapping one or more attributes from your input datasets into a single attribute in the schema for the unified dataset.
- Each unified attribute represents the header for one column in the unified dataset.
- A schema is a collection of unified attributes.
- The mappings can be one-to-one or many-to-one.
- You can map more than one attribute from an input dataset to the same unified attribute.
- You can choose to ignore attributes in input datasets and not map them to any unified attributes.
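The mapping rules above can be illustrated with a minimal Python sketch. This is not Tamr's API: the dataset names, attribute names, and the `unified_schema` helper are all hypothetical, and a plain dictionary stands in for whatever Tamr stores internally.

```python
from collections import defaultdict

# Hypothetical input attributes, written as (dataset, attribute) pairs.
# None of these names come from Tamr; they are for illustration only.
mappings = {
    ("customers_a", "full_name"): "name",    # one input attribute
    ("customers_b", "first_name"): "name",   # two attributes from the same
    ("customers_b", "last_name"): "name",    # dataset, one unified attribute
    ("customers_a", "phone"): "phone",       # one-to-one mapping
    ("customers_b", "internal_id"): None,    # ignored: left unmapped
}

def unified_schema(mappings):
    """Return {unified attribute: [input attributes mapped to it]}."""
    schema = defaultdict(list)
    for source, target in mappings.items():
        if target is not None:  # skip ignored input attributes
            schema[target].append(source)
    return dict(schema)

schema = unified_schema(mappings)
```

In this sketch, each key of `schema` corresponds to one column header in the unified dataset, and the ignored attribute does not appear anywhere in the result.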
To create a schema for the unified dataset, also known as the unified schema, you can:
- Bootstrap a unified schema for each table from an input dataset. Bootstrapping performs these steps:
  - Groups together attributes with the same name.
  - Assigns input attributes to unified attributes.
  - Uses the same name for a unified attribute as the name of the input attribute. You can also change the name.
- Create a list of unified schema attributes for the table ahead of time, and then map attributes from the input datasets to these unified attributes.
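As a rough illustration of what bootstrapping does conceptually, the following Python sketch groups input attributes by name and assigns each group to a unified attribute of the same name. It is an assumption-laden approximation, not Tamr's implementation: the `bootstrap_schema` function, its `renames` parameter, and the sample datasets are hypothetical.

```python
def bootstrap_schema(input_datasets, renames=None):
    """Group same-named input attributes under one unified attribute.

    input_datasets: {dataset name: [attribute names]} (hypothetical shape)
    renames: optional {input attribute name: unified attribute name}
    """
    renames = renames or {}
    schema = {}
    for dataset, attributes in input_datasets.items():
        for attribute in attributes:
            # The unified attribute takes the input attribute's name
            # unless the caller supplies a different one.
            unified_name = renames.get(attribute, attribute)
            schema.setdefault(unified_name, []).append((dataset, attribute))
    return schema

# Illustrative data only.
input_datasets = {
    "customers_a": ["name", "phone"],
    "customers_b": ["name", "email"],
}
bootstrapped = bootstrap_schema(input_datasets, renames={"phone": "phone_number"})
```

Here the two `name` attributes are grouped under a single unified attribute, while `phone` is mapped to a renamed unified attribute. The second approach in the list, defining unified attributes ahead of time, would instead start from a fixed set of names and map input attributes to them explicitly.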
Creating a unified schema is often an iterative process, especially as you add new input datasets. For example, as you work with your data, you may find that you would like to add more attributes from new input datasets to help describe a particular entity. Tamr helps automate most of this iterative schema mapping process.
After you have created the basic schema, run Tamr machine learning to improve your schema. Or, if the schema is fully mapped (as it may be for a categorization or mastering project), you can proceed to formatting the schema for the project.