
Single-Node Deployments

In single-node deployments, the Tamr platform is deployed on a single server.

Tamr requires deployment to a dedicated server and does not support deployments where the server on which Tamr is installed also runs other applications (as of v.2019.013). Only Tamr and its dependencies can run on this server. This applies to on-premises or cloud deployments, where Tamr is deployed on a physical server or a virtual machine (VM) environment.

Important: Tamr cannot operate on disks that are >80% full.
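The 80% limit can be checked before installation or as part of routine monitoring. The following is a minimal sketch using only the Python standard library; the function names and the path "/" are illustrative assumptions, not part of Tamr, and you should point the check at whichever volume holds your Tamr data.

```python
import shutil

# Threshold taken from the note above: Tamr cannot operate on disks >80% full.
MAX_USED_FRACTION = 0.80


def used_fraction(total_bytes: int, used_bytes: int) -> float:
    """Return the fraction of the disk in use (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes


def disk_is_too_full(path: str = "/") -> bool:
    """Return True if the filesystem holding `path` is more than 80% full."""
    usage = shutil.disk_usage(path)
    return used_fraction(usage.total, usage.used) > MAX_USED_FRACTION


if __name__ == "__main__":
    # "/" is a placeholder; use the mount point of the Tamr data volume.
    print("too full" if disk_is_too_full("/") else "ok")
```

Keeping the arithmetic in `used_fraction` separate from the filesystem call makes the threshold logic easy to verify without a real disk.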

Specifications for On Premises Deployments

Tamr can run entirely on a single server.

| Deployment | Resource | Specification |
|---|---|---|
| Large | CPU | 32 cores |
| | Memory | 256 GB |
| | Disk | 5 TB SSD |
| Medium | CPU | 16 cores |
| | Memory | 128 GB |
| | Disk | 2 TB SSD |
| Small (minimum) | CPU | 8 cores |
| | Memory | 64 GB |
| | Disk | 1 TB SSD |

XFS is the recommended filesystem for all Tamr deployments.
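When evaluating an existing server, the on-premises tiers can be treated as a simple lookup. The sketch below is illustrative only: `TIERS` and `largest_tier_met` are hypothetical names, and the figures are copied from the on-premises specifications in this section.

```python
from typing import Optional

# (tier name, min CPU cores, min memory in GB, min SSD in TB), largest first.
TIERS = [
    ("Large", 32, 256, 5),
    ("Medium", 16, 128, 2),
    ("Small", 8, 64, 1),
]


def largest_tier_met(cores: int, memory_gb: float, disk_tb: float) -> Optional[str]:
    """Return the largest tier whose CPU, memory, and disk minimums are all met.

    Returns None if the host is below the Small (minimum) specification.
    """
    for name, min_cores, min_memory_gb, min_disk_tb in TIERS:
        if cores >= min_cores and memory_gb >= min_memory_gb and disk_tb >= min_disk_tb:
            return name
    return None
```

A host must satisfy all three minimums of a tier to qualify for it, so a server with 32 cores but only 64 GB of memory is classified as Small.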

Specifications for Cloud Deployments

AWS Sizing and Limits

Tamr recommends the following configurations for single-node deployments on AWS.

| Deployment | Recommended Sizing | AWS EC2 Capacity/Storage* |
|---|---|---|
| Large | CPU 32 cores | r5.8xlarge |
| | Memory 256 GB | |
| | Disk 5 TB SSD | 5 TB EBS SSD |
| Medium | CPU 16 cores | r5.4xlarge |
| | Memory 128 GB | |
| | Disk 2 TB SSD | 2 TB EBS SSD |
| Small | CPU 8 cores | r5.2xlarge |
| | Memory 64 GB | |
| | Disk 1 TB SSD | 1 TB EBS SSD |

*Recommended AWS capacity or better

For more information, see Amazon EC2 On-Demand Pricing.

GCP Sizing and Limits

Check your Google Compute Engine (GCE) resource quota limits. For more information, see the Google Resource quotas documentation.

Tamr has the following minimum requirements for a single-node deployment:

  • 3 CPU cores and 64GB RAM.
  • For up to 20 million records, Tamr recommends an n1-highmem-8 instance deployment.
  • For larger numbers of records, Tamr recommends n1-highmem-16 or n1-highmem-32 instance deployments.

Tamr recommends the following configurations for single-node deployments.

| Deployment | Recommended Sizing | GCP Type/Storage Option* |
|---|---|---|
| Large | CPU 32 cores | n2-highmem-32 |
| | Memory 256 GB | |
| | Disk 5 TB SSD | 5 TB pd-balanced |
| Medium | CPU 16 cores | n2-highmem-16 |
| | Memory 128 GB | |
| | Disk 2 TB SSD | 2 TB pd-balanced |
| Small | CPU 8 cores | n2-highmem-8 |
| | Memory 64 GB | |
| | Disk 1 TB SSD | 1 TB pd-balanced |

*Recommended GCP capacity or better

For more information, see GCP Machine Types and Persistent disks in the GCP documentation.

Azure Sizing and Limits

Tamr recommends the following configurations for single-node deployments.

| Deployment | Recommended Sizing | Ev3 Series/Storage* |
|---|---|---|
| Large | CPU 32 cores | Standard_E32_v3 |
| | Memory 256 GB | |
| | Disk 5 TB SSD | Premium/Standard SSD 5 TB |
| Medium | CPU 16 cores | Standard_E16_v3 |
| | Memory 128 GB | |
| | Disk 2 TB SSD | Premium/Standard SSD 2 TB |
| Small | CPU 8 cores | Standard_E8_v3 |
| | Memory 64 GB | |
| | Disk 1 TB SSD | Premium/Standard SSD 1 TB |

*Recommended Azure capacity or better

For more information, see the Microsoft documentation about the Ev3 series and disk types.

PostgreSQL Deployment Requirements

For single-node Azure and GCP deployments, we only support installing Postgres on the same server as Tamr.

For single-node AWS deployments, we recommend installing Postgres on the same server as Tamr. If required, you can instead run Postgres on a separate AWS RDS Postgres instance using the Tamr AWS RDS Terraform module. If you deploy via RDS, you must follow the Terraform module instructions in Deploying Tamr on AWS and ensure that there is a network route between the Tamr VM and the RDS network.

NGINX Deployment Requirements

NGINX is a reverse proxy server configured to allow clients to access Tamr securely over HTTPS, and is a critical component in the Tamr network security layer. For more information, see Requirements for NGINX version support, Installing NGINX, and Configuring HTTPS.

For non-production environments, configuring a firewall (below), NGINX, and HTTPS is strongly recommended but not required.

Important: If you do not configure a firewall, NGINX, and HTTPS in a non-production deployment, all users on the network will have access to the data. Use a unique password for this deployment.

Firewall Requirements

For cloud deployments on AWS, Azure, or Google Cloud Platform, use the firewall provided by the cloud provider. This gives you control over and visibility into the firewall from the cloud console.

For on-premises VM deployments, use the firewall provided by the operating system.

Firewall configuration requirements:

  • Allow only internal access to Tamr default port 9100 (via TCP).
  • Open port 443 for HTTPS, with a restrictive IP range that you specify using IPv4 addresses in CIDR notation, such as 1.2.3.4/32.
    Note: If you plan to forward HTTP traffic to HTTPS, also open port 80.
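Before entering an allowed range for port 443 into a firewall rule, it can help to confirm that the value is well-formed IPv4 CIDR notation and covers only the addresses you intend. The following is a minimal sketch using Python's standard `ipaddress` module; the function names and the 256-address cutoff are illustrative assumptions, not Tamr requirements.

```python
import ipaddress


def parse_allowed_range(cidr: str) -> ipaddress.IPv4Network:
    """Parse an IPv4 CIDR string; raises ValueError on malformed input.

    strict=True also rejects ranges with host bits set, such as 1.2.3.4/24.
    """
    return ipaddress.IPv4Network(cidr, strict=True)


def is_restrictive(cidr: str, max_addresses: int = 256) -> bool:
    """Return True if the range covers at most `max_addresses` addresses.

    The default cutoff is an arbitrary illustration; choose your own policy.
    """
    return parse_allowed_range(cidr).num_addresses <= max_addresses


if __name__ == "__main__":
    for candidate in ("1.2.3.4/32", "0.0.0.0/0"):
        print(candidate, "restrictive" if is_restrictive(candidate) else "too broad")
```

A /32 range such as 1.2.3.4/32 matches exactly one address, while 0.0.0.0/0 matches every IPv4 address and would expose port 443 to the entire internet.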

Included Services and Ports

Securing Tamr Ports

When configuring the firewall, open to outside traffic only port 443 and any ports required by your system administrators. Tamr default port 9100 should be accessible ONLY inside the firewall.

Note: If you plan to forward HTTP traffic to HTTPS, also open port 80.
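One way to confirm the resulting firewall behavior is to attempt TCP connections from a machine outside the firewall. The sketch below uses only the Python standard library; `port_accepts_connections` is a hypothetical helper and `tamr.example.com` is a placeholder hostname, not a real Tamr endpoint.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    host = "tamr.example.com"  # placeholder; replace with your Tamr host
    for port in (443, 9100):
        state = "reachable" if port_accepts_connections(host, port) else "blocked"
        print(f"{host}:{port} is {state}")
```

When run from outside the firewall against a correctly secured deployment, port 443 should be reachable and port 9100 should be blocked. A successful connection shows only that the port is open, not that the service behind it is healthy.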

Tamr Microservices

A single-node deployment provides access to these microservices at the following default ports. In addition, they are available via a proxy at the default Tamr port (TCP 9100).

Optional administrative ports can be found at each of the listed service ports +1. For example, an administrative port for 9020 is found at 9021. They include endpoints for operational information and ensure that a heavy user request load cannot prevent administrative requests from getting through.

| Service | Default Port | Description |
|---|---|---|
| Auth | 9020 | User authentication |
| Dataset | 9150 | Dataset management |
| Data Movement Service | 9155 | Dataset movement between Tamr and cloud storage destinations |
| Dedup | 9140 | Deduplication service for mastering projects |
| Match | 9170 | Low Latency Match services |
| Persistence | 9080 | Database persistence |
| Preview | 9040 | Spark runner for preview of mappings and data transformations |
| Public API | 9180 | Public APIs for working with Tamr |
| Recipe | 9190 | Orchestration service for tracking tasks and their dependencies |
| Taxonomy | 9400 | Taxonomy service for classification projects |
| Transform | 9160 | Transformation service for schema mapping and running transformations |
| Unify | 9100 | Tamr front-end application |
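The +1 rule for administrative ports described in this section can be sketched as a small lookup, which is useful when generating firewall rules or monitoring checks. The dictionary below copies the default ports from the microservices table; `SERVICE_PORTS` and `admin_port` are illustrative names, and the values apply only if the defaults have not been changed.

```python
# Default microservice ports, copied from the table in this section.
SERVICE_PORTS = {
    "Auth": 9020,
    "Dataset": 9150,
    "Data Movement Service": 9155,
    "Dedup": 9140,
    "Match": 9170,
    "Persistence": 9080,
    "Preview": 9040,
    "Public API": 9180,
    "Recipe": 9190,
    "Taxonomy": 9400,
    "Transform": 9160,
    "Unify": 9100,
}


def admin_port(service_port: int) -> int:
    """Return the optional administrative port for a service port (port + 1)."""
    return service_port + 1


if __name__ == "__main__":
    for name, port in sorted(SERVICE_PORTS.items()):
        print(f"{name}: service {port}, admin {admin_port(port)}")
```

With the default values, no administrative port collides with another service's port, so both sets can be listed in the same internal-only firewall rule.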

External Services

| Service | Ports | Description |
|---|---|---|
| Elasticsearch | 9200, 9300 | Elasticsearch |
| Elasticsearch Front End | 9130 | Elasticsearch for Tamr front-end application |
| elasticsearch_exporter | 9135 | Instrument Elasticsearch for Prometheus metrics gathering |
| elasticsearch_logging | 9250, 9350 | Elasticsearch for logging |
| Grafana | 31101 | Monitoring dashboard |
| graphite_exporter | 31108, 31109 | Spark metrics for Prometheus |
| Kibana | 5601 | Logging dashboard |
| node_exporter | 9110 | System metrics for Prometheus |
| postgres_exporter | 31187 | PostgreSQL metrics for Prometheus |
| PostgreSQL | 5432 | Internal database for Tamr application metadata |
| Prometheus | 31390 | Monitoring and alerting framework |
| ZooKeeper | 21281 | Tamr HBase ZooKeeper client |
| HBase | 16010 | HBase Master |
| HBase | 9113 | HBase Master Exporter |
| HBase | 60010 | HBase Master JMX |
| HBase | 9114 | HBase Region Server Exporter |
| HBase | 60030 | HBase Region Server |
| HBase (configured by TAMR_HBASE_ZK_CLIENT_PORT) | 2181 | HBase ZooKeeper (as distinct from Tamr HBase ZooKeeper) |
| YARN | 8088 (HTTP), 8090 (HTTPS) | YARN Resource Manager dashboard |
| YARN | 8031, 8032, 8033, 8042 | YARN Resource Manager Resource, and its admin and tracker ports |
| YARN | 8030 | YARN Resource Manager scheduler |
