Error “failed to create output schemas for input schemas” with geospatial data

When mastering records with geospatial data, the Generate Pairs job or other jobs can fail with an error like the one at the end of this article, which mentions “failed to create output schemas for input schemas.”

A common cause is that Tamr has been configured to apply a geospatial similarity metric (e.g. Hausdorff Distance) to an attribute that has not been configured as a geospatial attribute. For example, the final “Caused by” entry in the trace below indicates that Tamr tried to convert the attribute “input-geometry” to a geospatial type, but that attribute is still configured as the default “string” type.
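In a long log, the offending attribute can be easier to spot with a small script. The following is a minimal sketch, not a Tamr utility: the function name is hypothetical, and the pattern and the "__features" suffix are assumptions based only on the wording of the sample trace in this article, so they may need adjusting for other versions.

```python
import re

# Pattern for the "Cannot convert field" line; wording taken from the sample trace.
CONVERT_ERROR = re.compile(
    r"Cannot convert field '(?P<field>[^']+)' from type \"(?P<from_type>[^\"]+)\""
)


def find_mistyped_field(log_text):
    """Return (attribute, current_type) for the first conversion error, or None."""
    match = CONVERT_ERROR.search(log_text)
    if match is None:
        return None
    # Assumption: the attribute name is everything before the "__features" suffix.
    attribute = match.group("field").split("__features", 1)[0]
    return attribute, match.group("from_type")


sample = (
    "Caused by: java.lang.UnsupportedOperationException: Cannot convert field "
    "'input-geometry__features.(array value)' from type \"string\" to type "
    "{\"type\":\"record\"}"
)
print(find_mistyped_field(sample))  # ('input-geometry', 'string')
```

A result whose second element is "string" points to an attribute that is still configured with the default type.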

This mismatch can occur when applying a geospatial similarity metric in the blocking model or in the pair classifier.

When you see an error like this, check that every geospatial attribute in the unified dataset has been configured as a Geospatial Attribute, correct any that have not, and then run the job again. Once the attribute types match, the job usually succeeds on rerun. For more information, please see the dedicated section on working with geospatial data in our public docs.
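If several attributes are meant to be geospatial, the input schema printed in the first line of the error can be checked against the list you expect. The sketch below assumes you have copied that schema JSON out of the message. The function names and the sample field names are hypothetical, and the six geometry member names are taken from the target type shown in the sample trace.

```python
import json

# Member names of the geospatial record type, copied from the sample error message.
GEO_MEMBERS = {
    "point", "multiPoint", "lineString",
    "multiLineString", "polygon", "multiPolygon",
}


def is_geospatial(type_spec):
    """True if type_spec is a record whose fields are exactly the geometry members."""
    if not isinstance(type_spec, dict) or type_spec.get("type") != "record":
        return False
    return {f["name"] for f in type_spec.get("fields", [])} == GEO_MEMBERS


def non_geospatial(schema, expected):
    """Return the names in `expected` whose type in `schema` is not geospatial."""
    bad = []
    for field in schema["fields"]:
        if field["name"] not in expected:
            continue
        type_spec = field["type"]
        # Feature fields in the sample are arrays, so inspect the element type.
        if isinstance(type_spec, dict) and type_spec.get("type") == "array":
            type_spec = type_spec["elementType"]
        if not is_geospatial(type_spec):
            bad.append(field["name"])
    return bad


# Illustrative schema in the same notation as the error message.
schema = json.loads(
    '{"type":"record","fields":['
    '{"name":"input-name__features","nullable":true,'
    '"type":{"type":"array","elementType":"string"}},'
    '{"name":"input-geometry__features","nullable":true,'
    '"type":{"type":"array","elementType":"string"}}]}'
)
print(non_geospatial(schema, {"input-geometry__features"}))
# ['input-geometry__features']
```

Any name this returns is an attribute to review in the unified dataset before rerunning the job.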

com.tamr.workflow.ComponentException: Node dataset-Test3_unified_dataset_dedup_features-205:innerSpecNode:pipeline:convertFieldTypes failed to create output schemas for input schemas: {data={"type":"record","fullySpecified":true,"fields":[{"name":"input-admin_tags_default_features","nullable":true,"type":{"type":"array","elementType":"string"}},{"name":"input-elevation_default_features","nullable":true,"type":{"type":"array","elementType":"string"}},{"name":"input-feature_id_default_features","nullable":true,"type":{"type":"array","elementType":"string"}},{"name":"input-name_default_features","nullable":true,"type":{"type":"array","elementType":"string"}},{"name":"input-other_tags_default_features","nullable":true,"type":{"type":"array","elementType":"string"}},{"name":"input-source_crs_default_features","nullable":true,"type":{"type":"array","elementType":"string"}},{"name":"input-type_default_features","nullable":true,"type":{"type":"array","elementType":"string"}},{"name":"input-areafeatures","nullable":true,"type":{"type":"array","elementType":"string"}},{"name":"input-bboxfeatures","nullable":true,"type":{"type":"array","elementType":"string"}},{"name":"input-lengthfeatures","nullable":true,"type":{"type":"array","elementType":"string"}},{"name":"input-geometryfeatures","nullable":true,"type":{"type":"array","elementType":"string"}},{"name":"sourceId","nullable":false,"type":"string"},{"name":"entityId","nullable":false,"type":"string"},{"name":"originSourceId","nullable":true,"type":"string"},{"name":"originEntityId","nullable":true,"type":"string"}]}}

  at com.tamr.workflow.spark.DagRunner.invokeComponent(DagRunner.java:77)
  at com.tamr.workflow.spark.DagRunner.lambda$accept$0(DagRunner.java:47)
  at com.tamr.collections.Dag.reduce(Dag.java:405)
  at com.tamr.workflow.spark.DagRunner.accept(DagRunner.java:46)
  at com.tamr.workflow.spark.DagRunner.accept(DagRunner.java:24)
  at com.tamr.pipeline.driver.PipelineDriver.run(PipelineDriver.java:89)
  at com.tamr.pipeline.driver.PipelineDriver.main(PipelineDriver.java:172)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:58)
  at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: com.tamr.workflow.ComponentException: Error during validating type conversion.
  at com.tamr.workflow.specs.TypeConversionComponentSpec.transformDataSchema(TypeConversionComponentSpec.java:97)
  at com.tamr.workflow.specs.AbstractFlatMappingSpec.getOutputSchemas(AbstractFlatMappingSpec.java:53)
  at com.tamr.workflow.spark.DagRunner.invokeComponent(DagRunner.java:75)
  ... 12 more
Caused by: java.lang.UnsupportedOperationException: Cannot convert field 'input-geometry__features.(array value)' from type "string" to type {"type":"record","fullySpecified":true,"fields":[{"name":"point","nullable":true,"type":{"type":"array","elementType":"double"}},{"name":"multiPoint","nullable":true,"type":{"type":"array","elementType":{"type":"array","elementType":"double"}}},{"name":"lineString","nullable":true,"type":{"type":"array","elementType":{"type":"array","elementType":"double"}}},{"name":"multiLineString","nullable":true,"type":{"type":"array","elementType":{"type":"array","elementType":{"type":"array","elementType":"double"}}}},{"name":"polygon","nullable":true,"type":{"type":"array","elementType":{"type":"array","elementType":{"type":"array","elementType":"double"}}}},{"name":"multiPolygon","nullable":true,"type":{"type":"array","elementType":{"type":"array","elementType":{"type":"array","elementType":{"type":"array","elementType":"double"}}}}}]}