This feature is currently in ALPHA.
If you would like to know more, please reach out to Tamr support.
Before starting, make sure you install the API Client for your language.
Start by importing the API Client library and authentication provider:
import unify_api_v1 as api
from unify_api_v1.auth import UsernamePasswordAuth
Next, create an authentication provider and use that to create an authenticated client:
# replace with your credentials
auth = UsernamePasswordAuth('username', 'password')
unify = api.Client(auth)
We hardcode the credentials in the code snippet for simplicity.
In production, you should read your credentials securely from a config file or environment variables.
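As a minimal sketch of the environment-variable approach, the helper below reads a username and password from the process environment. The variable names UNIFY_USERNAME and UNIFY_PASSWORD and the helper load_credentials are assumptions made for illustration; they are not defined by the API Client.

```python
import os


def load_credentials(env=os.environ):
    """Return (username, password) read from an environment-style mapping.

    UNIFY_USERNAME and UNIFY_PASSWORD are hypothetical variable names
    chosen for this sketch; use whatever names your deployment defines.
    """
    try:
        return env["UNIFY_USERNAME"], env["UNIFY_PASSWORD"]
    except KeyError as missing:
        # Fail early with a clear message instead of sending empty credentials.
        raise RuntimeError(
            f"missing environment variable: {missing.args[0]}"
        ) from None


# Usage (requires the API Client to be installed):
# auth = UsernamePasswordAuth(*load_credentials())
# unify = api.Client(auth)
```

Passing the mapping as an argument keeps the helper easy to test without touching the real environment.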
By default, the client tries to find the Unify instance on its default host.
To point to a different host, set the host argument when instantiating the Client.
For example, to connect to 10.20.0.1:
unify = api.Client(auth, host='10.20.0.1')
The API Clients expose some top-level collections, such as projects.
You can access these collections through the client and loop over their members with simple for-loops. For example, for projects:
for project in unify.projects:
    print(project.name)
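Because collections can be iterated and members expose a name attribute, the same loop can serve as a simple lookup. The sketch below is a hypothetical helper, find_by_name, that is not part of the API Client; it performs a linear scan and assumes only that each member has a name attribute.

```python
def find_by_name(collection, name):
    """Return the first member whose .name equals `name`, or None.

    Works on any iterable of objects with a `name` attribute, such as
    the collections exposed by the client.
    """
    for member in collection:
        if member.name == name:
            return member
    return None


# Usage (requires an authenticated client):
# project = find_by_name(unify.projects, "my project")
```

A scan like this fetches every member, so prefer a direct lookup method when the collection provides one.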
If you know the identifier for a specific resource, you can ask for it directly.
Top-level collections expose by_* methods (e.g. DatasetCollection.by_name) to fetch specific resources by those IDs. E.g. fetching a specific project by its relative ID:
relative_id = "projects/1"  # replace with your relative ID
project = unify.projects.by_relative_id(relative_id)
Models, not data
Note that the API Clients return models rather than the JSON data exposed by the RESTful HTTP API.
You can access a representation of the data via the .data accessor on a Model object (e.g. project.data) if necessary, but we recommend you use the Model objects directly.
Models often have relations to other models (e.g. a Project has a Unified Dataset).
You can access related models via convenience methods.
For example, to get the Unified Dataset for a particular Project, you could determine the Dataset ID for the Unified Dataset and then fetch it. But it's much easier to simply use the accessor from the Project Model object:
project = unify.projects.by_relative_id("projects/1")
ud = project.unified_dataset()
Some methods on Model objects can kick off long-running Unify operations.
Here, we kick off a "Unified Dataset refresh" operation:
operation = project.unified_dataset().refresh()
assert operation.succeeded()
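Python removes assert statements when run with the -O flag, so a script may prefer an explicit check. The sketch below wraps the synchronous refresh in a hypothetical helper, refresh_or_raise, which is not part of the API Client. It assumes only the refresh() and succeeded() methods used above.

```python
def refresh_or_raise(dataset):
    """Run a synchronous refresh and raise if the operation did not succeed.

    `dataset` is any object with a refresh() method that returns an
    operation exposing succeeded(), as in the snippet above.
    """
    operation = dataset.refresh()
    if not operation.succeeded():
        raise RuntimeError("Unified Dataset refresh did not succeed")
    return operation


# Usage (requires an authenticated client):
# operation = refresh_or_raise(project.unified_dataset())
```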
By default, the API Clients expose a synchronous interface for Unify operations.
You can opt in to an asynchronous interface via the asynchronous keyword argument for methods that kick off Unify operations.
operation = project.unified_dataset().refresh(asynchronous=True)
# do asynchronous stuff while operation is running
operation.wait()  # blocks until operation finishes
assert operation.succeeded()
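One possible use of the asynchronous interface is to start several refreshes before waiting on any of them. The sketch below uses a hypothetical helper, refresh_all, which is not part of the API Client. It assumes, as in the snippet above, that wait() blocks until the operation finishes and that the same object then reports succeeded(). Whether the server actually runs the operations concurrently is not stated here and may depend on your deployment.

```python
def refresh_all(datasets):
    """Start a refresh on every dataset, then wait for each to finish.

    Returns a list of booleans, one per dataset, in input order:
    True where the operation reports success.
    """
    # Kick off every operation first so none blocks the others from starting.
    operations = [dataset.refresh(asynchronous=True) for dataset in datasets]
    # Then block on each one in turn.
    for operation in operations:
        operation.wait()
    return [operation.succeeded() for operation in operations]


# Usage (requires an authenticated client):
# results = refresh_all([p.unified_dataset() for p in unify.projects])
```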