Common Questions on Core Connect

This article answers the questions our customers ask most often about Core Connect.

Q. What is the difference between Ingest and Batch for loading data into Tamr?
A. The batch endpoint loads multiple tables into multiple Tamr datasets. Suppose your system partitions data across hundreds of tables (for example, in the sciences, the results of each experiment are usually stored in a separate table). With batch, you submit a single JSON document that references those hundreds of tables and their corresponding target datasets, and the data is transferred in one call.
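As a rough illustration of the idea of one JSON document that pairs many tables with their target datasets, here is a minimal Python sketch. The field names ("targets", "query", "dataset") and the table and dataset names are hypothetical placeholders chosen for this example; they are not taken from the Core Connect API reference, so check the API documentation for the actual request schema.

```python
import json


def build_batch_payload(table_to_dataset):
    """Build one request body that pairs each source table with a target dataset.

    NOTE: the keys "targets", "query" and "dataset" are hypothetical,
    used only to illustrate the shape of a batch request.
    """
    return {
        "targets": [
            {"query": f"SELECT * FROM {table}", "dataset": dataset}
            for table, dataset in table_to_dataset.items()
        ]
    }


# One table per experiment, mirroring the scenario described above.
# The table and dataset names are made up for illustration.
mapping = {
    f"experiment_{i:03d}": f"experiment_{i:03d}_results" for i in range(1, 101)
}

payload = build_batch_payload(mapping)
body = json.dumps(payload)  # a single JSON document covering all 100 tables

print(len(payload["targets"]))
```

The point of the sketch is only that the whole table-to-dataset mapping travels in one request body, instead of one request per table.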

Q. Today, we load data in chunks of 10k records. Should we consider using batch?
A. Batch endpoints are for batching SELECT statements to load data from multiple tables, which is a separate concern from the size of the record chunks you load. The batchInsertSize setting referenced in the JSON controls how exported data is pooled together before it is sent to the target database. Increasing batchInsertSize usually reduces chatter (the number of calls Core Connect makes to the target) at the expense of increased memory consumption.
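The trade-off behind batchInsertSize can be sketched with simple arithmetic. The Python snippet below assumes a simplified model in which each pooled group of records is sent to the target in exactly one call; the function name and the record counts are illustrative only and do not describe Core Connect internals.

```python
import math


def insert_calls(total_records, batch_insert_size):
    """Number of calls to the target, assuming one call per pooled batch."""
    return math.ceil(total_records / batch_insert_size)


total = 1_000_000  # illustrative export size
for size in (1_000, 10_000, 100_000):
    # Under this model, up to `size` records are held in memory per call.
    print(size, insert_calls(total, size))
```

Under these assumptions, a tenfold increase in batchInsertSize cuts the number of calls by a factor of ten, while the number of records buffered in memory before each call grows by the same factor.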