Every unified attribute has a configuration that describes how it is used in machine learning. This configuration is represented by a configuration object that can be created, read, updated, or deleted.
Field | Description | Type |
---|---|---|
id | The deployment-specific ID of the attribute configuration returned. Unique across the entire deployment of Tamr. | String |
relativeId | The relative ID of the attribute configuration returned. | String |
relativeAttributeId | The relative ID of the unified dataset attribute to which the configuration applies. | String |
attributeRole | Optional. The specific role, if any, for which the associated attribute is to be used. Possible values include: SUM_ATTRIBUTE (i.e. spend) or CLUSTER_NAME_ATTRIBUTE (i.e. supplier). Each specific role can only apply to a single attribute in a unified dataset. The default is no specific role. | String |
similarityFunction | The similarity function to use for the unified dataset attribute. Possible values include: COSINE, JACCARD, ABSOLUTE_DIFF, and RELATIVE_DIFF. ABSOLUTE_DIFF is the default. | String |
enabledForMl | Whether the unified dataset attribute is included in machine learning operations. | Boolean |
tokenizer | The tokenizer used for tokenizing text values. Possible values include: DEFAULT, STEMMING_EN, BIGRAM, TRIGRAM, BI-WORD, and REGEX. Either this field or numericFieldResolution may be specified, but not both. | String |
numericFieldResolution | Indicates how to process numeric values. Either this field or tokenizer may be specified, but not both. | Array[int] |
attributeName | The user-specified name of the attribute. | String |
{
  "id": "unify://unified-data/v1/projects/1/attributeConfigurations/1",
  "relativeId": "projects/1/attributeConfigurations/1",
  "relativeAttributeId": "datasets/5/attributes/City",
  "attributeRole": "",
  "similarityFunction": "COSINE",
  "enabledForMl": true,
  "tokenizer": "BIGRAM",
  "numericFieldResolution": [],
  "attributeName": "City"
}
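
The rule that tokenizer and numericFieldResolution cannot both be specified can be checked on the client side before a configuration is submitted. The following Python sketch is illustrative only and is not part of the Tamr API: the function name `validate_attribute_configuration` is hypothetical, and it assumes that an empty string or an empty array means "not specified", as the sample above suggests. The tokenizer values are copied from the field table.

```python
import json

# Tokenizer values copied from the field table above.
TOKENIZERS = {"DEFAULT", "STEMMING_EN", "BIGRAM", "TRIGRAM", "BI-WORD", "REGEX"}


def validate_attribute_configuration(config: dict) -> list:
    """Hypothetical helper: return a list of problems; empty means none found."""
    problems = []
    # Assumption: "" and [] both mean the field was not specified.
    tokenizer = config.get("tokenizer") or ""
    resolution = config.get("numericFieldResolution") or []

    if tokenizer and resolution:
        problems.append("tokenizer and numericFieldResolution are mutually exclusive")
    if tokenizer and tokenizer not in TOKENIZERS:
        problems.append(f"unknown tokenizer: {tokenizer}")
    # The table types numericFieldResolution as Array[int]; bool is excluded
    # explicitly because Python treats it as a subclass of int.
    if not all(isinstance(n, int) and not isinstance(n, bool) for n in resolution):
        problems.append("numericFieldResolution must be a list of integers")
    return problems


# The sample configuration object shown above.
sample = json.loads("""
{
  "id": "unify://unified-data/v1/projects/1/attributeConfigurations/1",
  "relativeId": "projects/1/attributeConfigurations/1",
  "relativeAttributeId": "datasets/5/attributes/City",
  "attributeRole": "",
  "similarityFunction": "COSINE",
  "enabledForMl": true,
  "tokenizer": "BIGRAM",
  "numericFieldResolution": [],
  "attributeName": "City"
}
""")

print(validate_attribute_configuration(sample))  # prints []
```

A configuration that sets both fields, such as `{"tokenizer": "BIGRAM", "numericFieldResolution": [10]}`, would be reported as a conflict. Whether the server rejects such a payload or silently ignores one of the fields is not stated here, so the check is best treated as a convenience rather than a description of server behavior.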