Demonoid

Reputation: 41

ClickHouse: How to take an incremental backup of a ClickHouse DB, for both partitioned and unpartitioned tables?

I need to take a backup of a ClickHouse DB. Some of the tables are partitioned and some are not.

If possible, please provide a script or code example for better understanding.

Upvotes: 4

Views: 7831

Answers (3)

Prakhar Nigam

Reputation: 658

Please check the official documentation first.

There are multiple ways of taking a backup, but they all have their own shortcomings.

Method 1

General steps for a dump of data:

For the list of supported formats, refer to the ClickHouse formats documentation.

clickhouse-client --query="SELECT * FROM table FORMAT Native" > table.native

Native is the most efficient format. CSV, TabSeparated, and JSONEachRow are more portable: you can use them to import/export data to another DBMS.

Dump of metadata:
clickhouse-client --query="SHOW CREATE TABLE table" --format=TabSeparatedRaw > table.sql

Restore of metadata:
clickhouse-client < table.sql

Restore of data:
clickhouse-client --query="INSERT INTO table FORMAT Native" < table.native
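
The per-table commands above can be wrapped in a short loop over every table in a database. A minimal sketch (the database name `mydb` and the output path are placeholders, not values from the answer):

```shell
#!/bin/bash
# Sketch: dump the schema and data of every table in a hypothetical database "mydb".
DB=mydb
OUT=/backup/$DB
mkdir -p "$OUT"

for TABLE in $(clickhouse-client --query="SHOW TABLES FROM $DB"); do
    # metadata: the CREATE TABLE statement
    clickhouse-client --query="SHOW CREATE TABLE $DB.$TABLE" --format=TabSeparatedRaw > "$OUT/$TABLE.sql"
    # data, in the Native format
    clickhouse-client --query="SELECT * FROM $DB.$TABLE FORMAT Native" > "$OUT/$TABLE.native"
done
```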

Method 2

  • The ALTER TABLE ... FREEZE PARTITION command appears to simply create a local snapshot of a partition

    ALTER TABLE ... FREEZE PARTITION
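
A minimal sketch of this approach (the table name `events` and the partition ID are placeholders; FREEZE writes a hard-linked snapshot under /var/lib/clickhouse/shadow/, so the freeze itself is cheap):

```shell
#!/bin/bash
# Sketch: snapshot one partition of a hypothetical table "events".
clickhouse-client --query="ALTER TABLE events FREEZE PARTITION 202001"

# Copy the snapshot to backup storage, then clear shadow/ for the next run
BACKUP_DIR=/backup/events_202001
mkdir -p "$BACKUP_DIR"
cp -r /var/lib/clickhouse/shadow/* "$BACKUP_DIR"
rm -rf /var/lib/clickhouse/shadow/*
```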

Method 3

This tool (clickhouse-backup) makes ClickHouse backup and restore very easy, with S3 support. It can create and restore backups of all or specific tables, you can write your own queries and cron jobs, and incremental backups on S3 are supported.

Simple cron script for daily backup and uploading

#!/bin/bash
BACKUP_NAME=my_backup_$(date -u +%Y-%m-%dT%H-%M-%S)
clickhouse-backup create $BACKUP_NAME
clickhouse-backup upload $BACKUP_NAME
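
The matching restore path, assuming the same tool's download and restore subcommands (the backup name below is a placeholder; pick a real one from `clickhouse-backup list`):

```shell
#!/bin/bash
# Sketch: restore a backup previously uploaded by the cron script above.
BACKUP_NAME=my_backup_2020-01-01T00-00-00
clickhouse-backup download $BACKUP_NAME
clickhouse-backup restore $BACKUP_NAME
```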

Method 4

  • Binary copy of data directory.

    /var/lib/clickhouse/

For non-replicated tables: Stop the server, rsync/scp/etc... its data directory, start the server.

Make sure that file access rights and ownership are correct.

Upvotes: 9

dumer

Reputation: 308

https://github.com/AlexAkulov/clickhouse-backup

Tool for easy ClickHouse backup and restore with S3 and GCS support. Incremental backups supported too.

Upvotes: 3

Sunil Sunny

Reputation: 571

There are different options possible with different complexity and tradeoffs:

  1. You can have some sort of generic queue like Kafka, data from which is put to both ClickHouse for realtime reports and some different "cold" storage for backups (S3, HDFS, etc.).
  2. You can back up individual partitions as described here: https://clickhouse.yandex/docs/en/query_language/alter/#backups-and-replication
  3. You can have additional ClickHouse replica that has very large disk volume and does not participate in live queries, then use ZFS snapshots on it.
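
For option 2, the restore side of a partition backup might be sketched as follows (table name, partition ID, and paths are placeholders; frozen parts are copied into the table's detached/ directory and then attached):

```shell
#!/bin/bash
# Sketch: restore a previously frozen partition of a hypothetical table "events".
# Copy the backed-up part directories into the table's detached/ directory...
cp -r /backup/events_202001/* /var/lib/clickhouse/data/default/events/detached/
chown -R clickhouse:clickhouse /var/lib/clickhouse/data/default/events/detached/

# ...then attach the partition back into the table
clickhouse-client --query="ALTER TABLE events ATTACH PARTITION 202001"
```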

Upvotes: 1
