Categorizing Records

Overview

Once you upload the data and map it to attributes in the unified dataset, you can begin classifying it.

To start, you can search and filter to find records that match a specific category, and then label these by selecting New Categorization.

You can assign records to other users, and then accept or contest their categorizations. If your project has multiple Reviewers, they can upvote or downvote an assigned category to indicate agreement. A Curator then reviews the votes and chooses a winning categorization, known as the verified categorization. Tamr learns only from the verified categorization.

Records that have been categorized.

You can select Update Categorizations once two conditions are met: reviewers have assigned category labels to at least three transactions for each unique category, and each dataset is configured to indicate which fields the classifier should use. Update Categorizations launches the classification job in Tamr for the remaining records. The job does not use categories that have no user-generated suggestions.
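Before launching the job, it can help to check which categories already meet the three-label minimum. The following is a minimal sketch of that check, not part of Tamr: it assumes you have exported the user-assigned labels as a plain list of category names, and the function name `categories_ready` and its return shape are hypothetical.

```python
from collections import Counter

# Threshold stated in this guide: at least three labeled transactions per category.
MIN_LABELS_PER_CATEGORY = 3


def categories_ready(labels, taxonomy, min_labels=MIN_LABELS_PER_CATEGORY):
    """Split taxonomy categories by how many user-assigned labels they have.

    labels   -- list of category names, one per user-labeled record (assumed export)
    taxonomy -- iterable of all category names in the taxonomy
    Returns (ready, under, unused):
      ready  -- categories with at least min_labels labels
      under  -- dict of categories with some labels, but fewer than min_labels
      unused -- categories with no user labels (the job would not use these)
    """
    counts = Counter(labels)
    ready = sorted(c for c in taxonomy if counts[c] >= min_labels)
    under = {c: counts[c] for c in taxonomy if 0 < counts[c] < min_labels}
    unused = sorted(c for c in taxonomy if counts[c] == 0)
    return ready, under, unused


if __name__ == "__main__":
    user_labels = ["Bolts", "Bolts", "Bolts", "Nuts"]
    all_categories = ["Bolts", "Nuts", "Washers"]
    ready, under, unused = categories_ready(user_labels, all_categories)
    print("Ready:", ready)
    print("Need more labels:", under)
    print("No user labels:", unused)
```

Categories listed under "Need more labels" are the ones to prioritize when labeling before you select Update Categorizations.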

Adding New Categories

If a category is missing from the taxonomy, you can add it from the Parts page.

To add a new category to an existing taxonomy:

  1. On the Parts page, select a record by checking the box to the left.
  2. From the top menu, choose New Categorization, or choose Add categorization in the Categorization column. The dialog for adding a new category opens. Use the "plus" sign to add a category at whichever tier is missing it. For example, if you add a category at the third tier and its parent nodes are missing, you can also add a node at each parent level.
  3. Add a new category and choose Save.

Adding new categories to a taxonomy.
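The tier behavior described in step 2 can be pictured as inserting a path into a tree and creating any missing parent nodes along the way. The sketch below is only an illustration of that idea, not Tamr's implementation: the nested-dictionary representation of the taxonomy and the function name `add_category` are assumptions.

```python
def add_category(taxonomy, path):
    """Add a category path to a nested-dict taxonomy, creating missing parents.

    taxonomy -- nested dicts, e.g. {"USA": {"Midwest": {}}} (assumed representation)
    path     -- category names from the top tier down, e.g. ["USA", "Midwest", "Ohio"]
    Returns a list of (tier, name) pairs for every node that had to be created.
    """
    node = taxonomy
    created = []
    for tier, name in enumerate(path, start=1):
        if name not in node:
            node[name] = {}          # create the missing node at this tier
            created.append((tier, name))
        node = node[name]            # descend to the next tier
    return created


if __name__ == "__main__":
    taxonomy = {"USA": {}}
    # "Midwest" (tier 2) is missing, so it is created along with "Ohio" (tier 3).
    print(add_category(taxonomy, ["USA", "Midwest", "Ohio"]))
    print(taxonomy)
```

Adding the same path a second time creates nothing, which mirrors the fact that only missing tiers need a new node.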

Configuring the Parts Page

You can customize what you see on the Parts page for easier labeling. Use the filter tab to specify which records are displayed. For example, you can filter by user responses, by Tamr responses, and by the datasets from which the records originated.

You can also specify which columns will be displayed, and in which order, using the gears icon in the upper right corner.

Reviewing Categorization Results

On the taxonomy review page, the display on the right side shows details about each of the categories at different tiers.

In the following example, Tamr reports the number of records that have been classified for the Midwest node, the number of records that were suggested by Tamr, and the total spend in this category. You can also see the category hierarchy: the category's parent (USA) and children (the states in the Midwest). This view is sorted by number of records, but you can also sort categories by total spend.

Note that the dark green part of the bar under each category indicates finalized categorizations, and the light green bar indicates Tamr suggestions.

Reviewing categorization results.
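The per-node figures in this report can be reproduced outside the UI from exported records. The sketch below is a rough illustration under several assumptions, none of which come from Tamr itself: each record carries its full category path, a source flag of "verified" (finalized) or "suggested" (Tamr suggestion), and a spend amount, and a node's totals include the records of its child categories. The names `Record` and `summarize_node` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    path: tuple    # category path from the top tier down, e.g. ("USA", "Midwest", "Ohio")
    source: str    # "verified" (finalized) or "suggested" (Tamr suggestion); assumed values
    spend: float   # spend amount for this record


def summarize_node(records, node_path):
    """Summarize records at or below node_path (roll-up to children is an assumption)."""
    node_path = tuple(node_path)
    in_node = [r for r in records if r.path[:len(node_path)] == node_path]
    return {
        "records": len(in_node),
        "verified": sum(1 for r in in_node if r.source == "verified"),
        "suggested": sum(1 for r in in_node if r.source == "suggested"),
        "total_spend": sum(r.spend for r in in_node),
    }


if __name__ == "__main__":
    records = [
        Record(("USA", "Midwest", "Ohio"), "verified", 100.0),
        Record(("USA", "Midwest", "Iowa"), "suggested", 50.0),
        Record(("USA", "West", "Utah"), "verified", 25.0),
    ]
    print(summarize_node(records, ("USA", "Midwest")))
```

In this sketch, the "verified" and "suggested" counts correspond to the dark green and light green portions of the bar, respectively.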