
Single-Node Deployments

In single-node deployments, the Tamr Core platform is deployed on a single server.

Tamr Core must be deployed on a dedicated server. Only Tamr Core and its dependencies can run on this server. This requirement applies to on-premises and cloud-native deployments, where Tamr Core is deployed on a physical server or a virtual machine (VM) environment.

Tamr does not support deployments in which Tamr Core is installed on a server that also runs other applications.

Important: Tamr Core cannot operate on disks that are more than 80% full. Starting with release v2022.001.0, Tamr Core validation scripts verify that at least 20% of disk space is available.
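The 80% rule above can be checked with a short script. This is a minimal sketch; `MOUNT` is a placeholder and should point at the mount that holds the Tamr home directory.

```shell
# Sketch: warn when the Tamr Core data disk is more than 80% full.
# MOUNT is a placeholder; set it to the mount holding the Tamr home directory.
MOUNT="/"
used_pct=$(df --output=pcent "$MOUNT" | tail -1 | tr -dc '0-9')
if [ "$used_pct" -gt 80 ]; then
  echo "WARNING: $MOUNT is ${used_pct}% full; Tamr Core requires at least 20% free"
else
  echo "OK: $MOUNT is ${used_pct}% full"
fi
```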

Sizing Guidelines

In general, Tamr Core performance scales linearly with computational resources. The specifications below use these "t-shirt" sizes:

  • Large: 1M to 10M records
  • Medium: 100k to 1M records
  • Small: up to 100k records

For larger volumes, Tamr offers cloud-native deployments that can be scaled to meet your needs.

Specifications for On-Premises Deployments

Tamr Core can run entirely on a single server.

Deployment      | Resource | Specification
Large           | CPU      | 32 cores
                | Memory   | 256 GB
                | Disk     | 5 TB SSD
Medium          | CPU      | 16 cores
                | Memory   | 128 GB
                | Disk     | 2 TB SSD
Small (minimum) | CPU      | 8 cores
                | Memory   | 64 GB
                | Disk     | 1 TB SSD

XFS is the recommended file system for all Tamr Core deployments.
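To confirm which file system backs a given mount, you can query it directly. A minimal sketch; `MOUNT` is a placeholder for the mount that holds the Tamr home directory.

```shell
# Sketch: report the file system type of a mount point and note whether
# it matches the XFS recommendation. MOUNT is a placeholder.
MOUNT="/"
fstype=$(df --output=fstype "$MOUNT" | tail -1 | tr -d ' ')
if [ "$fstype" = "xfs" ]; then
  echo "OK: $MOUNT uses XFS"
else
  echo "NOTE: $MOUNT uses $fstype; XFS is recommended for Tamr Core"
fi
```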

Specifications for Cloud Deployments

AWS Sizing and Limits

Tamr recommends the following configurations for single-node deployments on AWS.

Deployment | Recommended Sizing          | AWS EC2 Capacity/Storage*
Large      | CPU 32 cores, Memory 256 GB | r6a.8xlarge
           | Disk 5 TB SSD               | 5 TB EBS SSD
Medium     | CPU 16 cores, Memory 128 GB | r6a.4xlarge
           | Disk 2 TB SSD               | 2 TB EBS SSD
Small      | CPU 8 cores, Memory 64 GB   | r6a.2xlarge
           | Disk 1 TB SSD               | 1 TB EBS SSD

*Recommended AWS capacity or better. Do not use an AWS Graviton processor for the VM that runs Tamr Core.

For more information, see Amazon EC2 On-Demand Pricing.

GCP Sizing and Limits

Check your Google Compute Engine (GCE) resource quota limits. For more information, see Resource quotas in the Google Cloud documentation.

Tamr recommends the following configurations for single-node deployments.

Deployment | Recommended Sizing          | GCP Machine Type/Storage Option*
Large      | CPU 32 cores, Memory 256 GB | n2-highmem-32
           | Disk 5 TB SSD               | 5 TB pd-balanced
Medium     | CPU 16 cores, Memory 128 GB | n2-highmem-16
           | Disk 2 TB SSD               | 2 TB pd-balanced
Small      | CPU 8 cores, Memory 64 GB   | n2-highmem-8
           | Disk 1 TB SSD               | 1 TB pd-balanced

*Recommended GCP capacity or better

For more information, see GCP Machine Types and Persistent disks in the GCP documentation.

Azure Sizing and Limits

Tamr recommends the following configurations for single-node deployments.

Deployment | Recommended Sizing          | Ev3 Series/Storage*
Large      | CPU 32 cores, Memory 256 GB | Standard_E32_v3
           | Disk 5 TB SSD               | Premium/Standard SSD 5 TB
Medium     | CPU 16 cores, Memory 128 GB | Standard_E16_v3
           | Disk 2 TB SSD               | Premium/Standard SSD 2 TB
Small      | CPU 8 cores, Memory 64 GB   | Standard_E8_v3
           | Disk 1 TB SSD               | Premium/Standard SSD 1 TB

*Recommended Azure capacity or better

For more information, see the Microsoft documentation about the Ev3 series and disk types.

PostgreSQL Deployment Requirements

For single-node Azure and GCP deployments, Tamr only supports installing PostgreSQL on the same server as Tamr Core.

For single-node AWS deployments, Tamr recommends installing PostgreSQL on the same server as Tamr Core. If required, you can install PostgreSQL on a separate AWS RDS PostgreSQL instance using the Tamr AWS RDS Terraform module. If deploying via RDS, you must follow the Terraform module instructions in Deploying Tamr on AWS and ensure that there is a route between the Tamr Core VM and the RDS network.
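The route requirement above can be verified from the Tamr Core VM with a simple TCP connectivity check. This is a sketch; the host name is a hypothetical placeholder for your RDS endpoint.

```shell
# Sketch: confirm a network route from the Tamr Core VM to an external
# PostgreSQL endpoint such as AWS RDS. RDS_HOST is a placeholder.
RDS_HOST="tamr-db.example.com"   # hypothetical RDS endpoint
RDS_PORT=5432                    # default PostgreSQL port
if timeout 5 bash -c "exec 3<>/dev/tcp/$RDS_HOST/$RDS_PORT" 2>/dev/null; then
  reachable=yes
else
  reachable=no
fi
echo "route to $RDS_HOST:$RDS_PORT reachable: $reachable"
```

A `yes` here only proves TCP reachability; database authentication is still verified separately during installation.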

NGINX Deployment Requirements

NGINX is a reverse proxy server configured to allow clients to access Tamr Core securely over HTTPS, and is a critical component in the Tamr Core network security layer. For more information, see Requirements for NGINX version support, Installing NGINX, and Configuring HTTPS.
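A reverse-proxy configuration of the kind described above might look like the following sketch. The server name and certificate paths are placeholders, not values from the Tamr documentation; only the upstream port (9100, the Tamr Core default) comes from this guide.

```nginx
# Sketch: NGINX server block proxying HTTPS traffic to Tamr Core.
server {
    listen 443 ssl;
    server_name tamr.example.com;                  # placeholder
    ssl_certificate     /etc/nginx/ssl/tamr.crt;   # placeholder path
    ssl_certificate_key /etc/nginx/ssl/tamr.key;   # placeholder path

    location / {
        proxy_pass http://localhost:9100;          # Tamr Core default port
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto https;
    }
}
```

See Configuring HTTPS for the supported configuration steps.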

For non-production environments, configuring a firewall (below), NGINX, and HTTPS is strongly recommended but not required.

Important: If you do not configure a firewall, NGINX, and HTTPS in a non-production deployment, all users on the network will have access to the data. Use a unique password for this deployment.

Firewall Requirements

For cloud deployments on AWS, Azure, or Google Cloud Platform, use the firewall provided by the cloud provider. This gives you control over, and visibility into, the firewall from the cloud console.

For on-premises VM deployments, use the firewall provided by the operating system.

Firewall configuration requirements:

  • Implement least-privilege principles. Block all traffic to the Tamr Core host by default and allow only the specific traffic you need, limiting each rule to the required protocols and ports.
  • If you are restricting traffic to Tamr Core based on IP addresses, minimize the number of rules. It is easier to track one rule that allows traffic from a range of 16 IPs than 16 separate rules. Specify IPv4 addresses in CIDR notation, such as 1.2.3.4/32.
  • Allow only internal access to Tamr Core default port 9100 (via TCP).
  • Open port 443 for HTTPS.
    Note: If you plan to forward HTTP traffic to HTTPS, also open port 80.
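The requirements above could be expressed as firewall rules roughly as follows. This is a sketch using `ufw` (Ubuntu); the internal CIDR range is a placeholder, and the equivalent rules apply to `firewalld` or a cloud provider's firewall.

```shell
# Sketch: least-privilege firewall rules for a Tamr Core host (ufw).
INTERNAL_CIDR="10.0.0.0/16"   # placeholder: your internal network
sudo ufw default deny incoming                                   # block all by default
sudo ufw allow 443/tcp                                           # HTTPS
sudo ufw allow 80/tcp                                            # only if forwarding HTTP to HTTPS
sudo ufw allow from "$INTERNAL_CIDR" to any port 9100 proto tcp  # Tamr Core, internal only
sudo ufw allow 22/tcp                                            # SSH for administrators
sudo ufw enable
```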

Included Services and Ports

Securing Tamr Core Ports

When configuring the firewall, open to outside traffic only port 443 and any ports required by your system administrators. For example, port 22 is often used for SSH access.

You can configure the host firewall through your cloud provider's console or through the host's operating system.


Tamr Core Microservices

A single-node deployment provides access to the following microservices at their default ports. They are also available via a proxy at the default Tamr Core port (TCP 9100).

Each microservice also exposes an optional administrative port at its default port plus one. For example, the administrative port for a service on 9020 is 9021. Administrative ports include endpoints for operational information and ensure that a heavy user request load cannot prevent administrative requests from getting through.
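The port-plus-one convention can be sketched for a few of the services listed below:

```shell
# Sketch: derive each microservice's administrative port (service port + 1).
for entry in Auth:9020 Dataset:9150 Dedup:9140 Transform:9160; do
  name=${entry%%:*}   # service name before the colon
  port=${entry##*:}   # default port after the colon
  echo "$name service port $port, admin port $((port + 1))"
done
```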

Service      | Default Port | Description
Auth         | 9020         | User authentication
Dataset      | 9150         | Dataset management
Core Connect | 9050         | Dataset movement between Tamr Core and cloud storage destinations
Dedup        | 9140         | Deduplication service for mastering projects
Match        | 9170         | Low Latency Match services
Persistence  | 9080         | Database persistence
Preview      | 9040         | Spark runner for preview of mappings and data transformations
Public API   | 9180         | Public APIs for working with Tamr Core
Recipe       | 9190         | Orchestration service for tracking tasks and their dependencies
Taxonomy     | 9400         | Taxonomy service for classification projects
Transform    | 9160         | Transformation service for schema mapping and running transformations
Unify        | 9100         | Tamr Core front-end application

External Services

Service                 | Ports                                       | Description
Elasticsearch           | 9200, 9300                                  | Elasticsearch
Elasticsearch Front End | 9130                                        | Elasticsearch for the Tamr Core front-end application
elasticsearch_exporter  | 9135                                        | Instruments Elasticsearch for Prometheus metrics gathering
elasticsearch_logging   | 9250, 9350                                  | Elasticsearch for logging
Grafana                 | 31101                                       | Monitoring dashboard
graphite_exporter       | 31108, 31109                                | Spark metrics for Prometheus
Kibana                  | 5601                                        | Logging dashboard
node_exporter           | 9110                                        | System metrics for Prometheus
postgres_exporter       | 31187                                       | PostgreSQL metrics for Prometheus
PostgreSQL              | 5432                                        | Internal database for Tamr Core application metadata
Prometheus              | 31390                                       | Monitoring and alerting framework
ZooKeeper               | 21281                                       | Tamr Core HBase ZooKeeper client
HBase                   | 16010                                       | HBase Master
HBase                   | 9113                                        | HBase Master exporter
HBase                   | 60010                                       | HBase Master JMX
HBase                   | 9114                                        | HBase Region Server exporter
HBase                   | 60030                                       | HBase Region Server
HBase                   | 2181 (configured by TAMR_HBASE_ZK_CLIENT_PORT) | HBase ZooKeeper (as distinct from Tamr HBase ZooKeeper)
YARN                    | 8088 (HTTP), 8090 (HTTPS)                   | YARN Resource Manager dashboard
YARN                    | 8031, 8032, 8033, 8042                      | YARN Resource Manager resource, admin, and tracker ports
YARN                    | 8030                                        | YARN Resource Manager scheduler