Backup Fails with SnapshotDoesNotExistException
Note: Apply this workaround only to certain derived datasets (e.g. a ‘grouped entities’ dataset), not to any other dataset. Contact [email protected] if you have questions about whether this article applies to you.
Problem: Backup fails with the error SnapshotDoesNotExistException.
{
…
"errorMessage": "org.apache.hadoop.hbase.snapshot.SnapshotDoesNotExistException: org.apache.hadoop.hbase.snapshot.SnapshotDoesNotExistException: Snapshot '_2021_2D_05_2D_06__19_2D_53_2D_44_2D_869_CUSTOMER_SITE_MASTERING_unified_dataset_dedup_grouped_entities' doesn't exist on the filesystem\n\tat org.apache.hadoop.hbase.master.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:273)\n\tat org.apache.hadoop.hbase.master.MasterRpcServices.deleteSnapshot(MasterRpcServices.java:521)\n\tat org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:58583)\n\tat org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2339)\n\tat org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)\n\tat org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)\n\tat org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)\n",
…
}
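The snapshot name can be pulled out of the errorMessage field programmatically. The following is a minimal sketch, not part of Tamr: the helper names snapshot_name and strip_timestamp are hypothetical, and the timestamp-prefix pattern is an assumption modelled only on the single example above (where _2D_ appears to stand for an encoded hyphen). Note that the commands in the workaround spell the table with a tamr: prefix and doubled underscores, so treat the result as a hint and confirm the exact table name with list in the HBase shell.

```python
import re

# Matches the quoted snapshot name inside the HBase exception text.
SNAPSHOT_RE = re.compile(r"Snapshot '([^']+)' doesn't exist")

# ASSUMPTION: timestamp prefix shaped like the example in this article,
# i.e. _YYYY_2D_MM_2D_DD__HH_2D_MM_2D_SS_2D_mmm_ in front of the dataset name.
TIMESTAMP_PREFIX_RE = re.compile(
    r"^_\d{4}_2D_\d{2}_2D_\d{2}__\d{2}_2D_\d{2}_2D_\d{2}_2D_\d{3}_"
)


def snapshot_name(error_message):
    """Return the snapshot name quoted in the error, or None if absent."""
    match = SNAPSHOT_RE.search(error_message)
    return match.group(1) if match else None


def strip_timestamp(snapshot):
    """Remove the assumed timestamp prefix; other names are returned unchanged."""
    return TIMESTAMP_PREFIX_RE.sub("", snapshot, count=1)


if __name__ == "__main__":
    message = (
        "org.apache.hadoop.hbase.snapshot.SnapshotDoesNotExistException: "
        "Snapshot '_2021_2D_05_2D_06__19_2D_53_2D_44_2D_869_CUSTOMER_SITE_"
        "MASTERING_unified_dataset_dedup_grouped_entities' "
        "doesn't exist on the filesystem"
    )
    name = snapshot_name(message)
    print(name)
    print(strip_timestamp(name))
```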
No definitive cause has been discovered yet.
Releases Affected: First observed on v2020.20; it might still recur on recent versions.
Releases Fixed: N/A
Resolution/Workaround:
Phase I:
- Stop Tamr.
- Drop the table and recreate it:
a. The table name appears in the error message, as the part of the snapshot name after the timestamp prefix. In the example above it is CUSTOMER_SITE_MASTERING_unified_dataset_dedup_grouped_entities
b. Open the HBase shell:
${TAMR_HOME}/hbase-1.3.1/bin/hbase shell
Then run the following commands, replacing the table name with the table from your error message. HBase does not drop an enabled table, so disable it first:
disable 'tamr:CUSTOMER__SITE__MASTERING__unified__dataset__dedup__grouped__entities'
drop 'tamr:CUSTOMER__SITE__MASTERING__unified__dataset__dedup__grouped__entities'
create 'tamr:CUSTOMER__SITE__MASTERING__unified__dataset__dedup__grouped__entities', {NAME=>'C', VERSIONS=>2147483647, KEEP_DELETED_CELLS => false, COMPRESSION => 'SNAPPY' }
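To avoid typing the long table name several times, the shell commands can be generated from a single value. This is a minimal sketch: recreate_commands is a hypothetical helper, and the column-family settings are copied from the create command above. The disable line is an addition to the commands shown in this article, included because HBase refuses to drop a table that is still enabled. The script only prints text; review the output before pasting it into the HBase shell.

```python
def recreate_commands(table):
    """Return HBase shell commands that disable, drop and recreate `table`."""
    if "'" in table:
        # A quote would break out of the single-quoted shell argument.
        raise ValueError("table name must not contain a single quote")
    # Column-family definition copied verbatim from the article's create command.
    column_family = (
        "{NAME=>'C', VERSIONS=>2147483647, "
        "KEEP_DELETED_CELLS => false, COMPRESSION => 'SNAPPY' }"
    )
    return [
        f"disable '{table}'",
        f"drop '{table}'",
        f"create '{table}', {column_family}",
    ]


if __name__ == "__main__":
    # Replace with the table name confirmed for your own error message.
    table = (
        "tamr:CUSTOMER__SITE__MASTERING__unified__dataset__dedup"
        "__grouped__entities"
    )
    print("\n".join(recreate_commands(table)))
```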
Phase II:
Let Tamr repopulate the data:
- Update the unified dataset of that project.
- Set the TAMR_DEDUP_DISABLE_INCREMENTAL configuration variable to TRUE. See instructions here.
"disableIncrementalDedup": true
- Generate pairs.
- Update results.
- Publish clusters.
- Set the TAMR_DEDUP_DISABLE_INCREMENTAL configuration variable back to FALSE, following the same instructions described here.
"disableIncrementalDedup": false