Balan

Reputation: 393

Can we configure MarkLogic database backups to an S3 bucket?

I need to configure MarkLogic full/incremental backups to an S3 bucket. Is this possible? Can anyone share the documentation or the steps to configure it?

Thanks!

Upvotes: 1

Views: 563

Answers (1)

Mads Hansen

Reputation: 66783

Yes, you can back up to S3.

You will need to configure S3 credentials so that MarkLogic can read and write objects in your S3 bucket.

MarkLogic can't use S3 for journal archive paths, because S3 does not support file append operations. So if you want to enable journal archiving, you will need to specify a custom, non-S3 path for the journal archive when creating your backups.

Backing Up a Database

The directory you specified can be an operating system mounted directory path, it can be an HDFS path, or it can be an S3 path. For details on using HDFS and S3 storage in MarkLogic, see Disk Storage Considerations in the Query Performance and Tuning Guide.

S3 Storage

S3 requires authentication with the following S3 credentials:

  • AWS Access Key
  • AWS Secret Key

The S3 credentials for a MarkLogic cluster are stored in the security database for the cluster. You can only have one set of S3 credentials per cluster. You can set up security access in S3, you can access any paths that are allowed access by those credentials. Because of the flexibility of how you can set up access in S3, you can set up any S3 account to allow access to any other account, so if you want to allow the credentials you have set up in MarkLogic to access S3 paths owned by other S3 users, those users need to grant access to those paths to the AWS Access Key set up in your MarkLogic Cluster.

To set up the AWS credentials for a cluster, enter the keys in the Admin Interface under Security > Credentials. You can also set up the keys programmatically using the following Security API functions:

  • sec:credentials-get-aws
  • sec:credentials-set-aws

The credentials are stored in the Security database. Therefore, you cannot use S3 as the forest storage for a security database.

If you want journal archiving enabled, you will need to have the journal archives written to a different location. Journal archiving is not supported on S3.

By default the journal archive is written inside the backup directory, but when creating a backup programmatically you can specify a different $journal-archive-path.
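As a rough illustration of that split (backup on S3, journal archive elsewhere), here is a minimal Python sketch that builds a backup request for the REST Management API. Everything specific in it is an assumption rather than something from the docs quoted here: the bucket name, the local journal path, the host, the default manage port 8002, digest authentication, and the helper names `build_backup_request` and `start_backup` are all hypothetical. The JSON property names (`backup-dir`, `incremental`, `journal-archiving`, `journal-archive-path`) are written as I recall them from the `backup-database` operation of `POST /manage/v2/databases/{name}`, so please check them against the documentation for your MarkLogic version.

```python
import json
import urllib.request


def build_backup_request(backup_dir, journal_archive_path=None, incremental=False):
    """Build the JSON body for a backup-database operation (property names assumed)."""
    body = {"operation": "backup-database", "backup-dir": backup_dir}
    if incremental:
        body["incremental"] = True
    if journal_archive_path is not None:
        # Journal archives need file append support, so they cannot live on S3.
        if journal_archive_path.lower().startswith("s3://"):
            raise ValueError("journal archive path must not be on S3")
        body["journal-archiving"] = True
        body["journal-archive-path"] = journal_archive_path
    return body


def start_backup(host, database, user, password, body):
    """POST the body to the Management API (port 8002 and digest auth are assumptions)."""
    url = f"http://{host}:8002/manage/v2/databases/{database}"
    passwords = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    passwords.add_password(None, url, user, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPDigestAuthHandler(passwords)
    )
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with opener.open(request) as response:
        return response.status, response.read().decode("utf-8")


# Hypothetical example values: backup goes to S3, journals go to local disk.
example = build_backup_request(
    "s3://my-bucket/backups/Documents",
    journal_archive_path="/var/backups/marklogic/journals",
)
```

You would then call something like `start_backup("localhost", "Documents", user, password, example)` against your own cluster. The helper rejects an `s3://` journal archive path up front, since MarkLogic does not support journal archiving on S3.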

S3 and MarkLogic

Storage on S3 has an 'eventual consistency' property, meaning that write operations might not be available immediately for reading, but they will be available at some point. Because of this, S3 data directories in MarkLogic have a restriction that MarkLogic does not create Journals on S3. Therefore, MarkLogic recommends that you use S3 only for backups and for read-only forests, otherwise you risk the possibility of data loss. If your forests are read-only, then there is no need to have journals.

Upvotes: 3
