How To Extract Tiers 1 and 2 as Separate Columns in a Categorization Project?

In some situations, you may want to retrieve tiers 1 and 2 categories under which a record is categorized.

This is how you do that:

  1. Create a new Schema mapping project

  2. Add<categorization_project>_unified_dataset_classifications_with_data as the source input dataset to the project

  3. On the Schema Mapping page, bootstrap all the attributes. Create new Columns <new_column1> and <new_column2> by clicking "Create" on the right panel of the schema mapping page. For the eg, let's consider <new_column1> as level1 and <new_column2> as level2

  1. Run "Update Unified Dataset"
  2. The column "suggestedClassificationPath" contains all the tiers either separated by '|' or ',' depending on transformations in the original categorization project.
  3. On the "Unified Dataset" page, on the right panel, click "Add transformation" and select type "script" under "TRANSFORMATIONS ON UNIFIED DATASET"
  4. In the blank under the script, please enter the following transformation:
    a. If the tiers are separated by '|' (pipe character), add:
SELECT *, split(to_string(suggestedClassificationPath), '\\|') AS splitPath;
SELECT *, array(get(splitPath,0)) AS level1, array(get(splitPath,1)) AS level2;

b. Or, if the tiers are separated by ',' (comma), add:

SELECT *, array(get(suggestedClassificationPath,0)) AS level1,
array(get(suggestedClassificationPath,1)) AS level2;
  1. You could test your transformation changes by clicking "Preview".

  1. After testing and making sure your transformation changes work, please click on the "Done" button and then click "save changes"

And you’re done! This populates the two new columns where each column contains tier1 and tier2 respectively.