NiGHTS

Reputation: 23

What is the recommended procedure for purging all non-current data from a CouchDB database?

Say I have a database with 100 records, each with 1000 revisions, plus 100,000 deleted documents, each with an extensive revision history. The database also has a view document and some Mango indexes.

For this hypothetical situation, let's assume I can't delete and rebuild the database. Replication safety is also not a concern.

Suppose I have to write a script, using curl, that purges all unused data from the database. After it runs, the database should be exactly the same as if I had deleted it and rebuilt it with only the 100 records, each with a single revision on file. How should I go about this?

Upvotes: 0

Views: 407

Answers (1)

fynnlyte

Reputation: 1019

For your hypothetical situation, you could do the following:

  1. Back up the 100 required documents.
  2. Delete all documents in the DB.
  3. Use the Purge API (`POST /{db}/_purge`) to remove the documents together with their revision history.
  4. Re-create the 100 required documents from the backup.
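
The four steps above could be sketched roughly as follows. This is Python rather than curl so that the JSON handling stays readable, and it is a minimal sketch under several assumptions, not a tested tool: the helper names are made up for illustration, authentication is left out, nothing else writes to the database while it runs, and the batch size of 100 reflects what I believe is the default per-request limit on document IDs for `_purge` (check your server's `[purge]` config). Design documents (views, Mango indexes) show up in `_all_docs` like any other document, so here they are backed up and restored along with the rest. The leaf revisions to purge are read from `_changes?style=all_docs`, which also lists documents that were already deleted.

```python
import json
import urllib.request


def deletion_stubs(docs):
    """Step 2: tombstones for _bulk_docs, one per backed-up document."""
    return [{"_id": d["_id"], "_rev": d["_rev"], "_deleted": True} for d in docs]


def purge_batches(changes_results, batch_size=100):
    """Step 3: yield _purge request bodies of the form {doc_id: [leaf revs]}."""
    items = [(r["id"], [c["rev"] for c in r["changes"]]) for r in changes_results]
    for i in range(0, len(items), batch_size):
        yield dict(items[i:i + batch_size])


def without_rev(docs):
    """Step 4: drop _rev so each document is re-created as a first revision."""
    return [{k: v for k, v in d.items() if k != "_rev"} for d in docs]


def call(base_url, method, path, body=None):
    """Hypothetical helper: send one JSON request and decode the JSON reply."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def purge_all_but_current(base_url, db):
    # 1. Back up the current revision of every live document.
    rows = call(base_url, "GET", f"/{db}/_all_docs?include_docs=true")["rows"]
    docs = [row["doc"] for row in rows]
    # 2. Delete them all.
    call(base_url, "POST", f"/{db}/_bulk_docs", {"docs": deletion_stubs(docs)})
    # 3. Purge every leaf revision, including old tombstones.
    changes = call(base_url, "GET", f"/{db}/_changes?style=all_docs")["results"]
    for batch in purge_batches(changes):
        call(base_url, "POST", f"/{db}/_purge", batch)
    # 4. Re-create the documents from the backup.
    call(base_url, "POST", f"/{db}/_bulk_docs", {"docs": without_rev(docs)})
```

Step 2 is destructive and this sketch keeps the backup only in memory, so in practice you would write `docs` to a file before deleting anything. A compaction run afterwards is what actually reclaims the disk space.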

A safer approach to saving disk space and reducing B-tree size in a real-life scenario would be:

  1. Configure CouchDB's compaction and the database's revision limit (`_revs_limit`) so that it does not retain too many revisions.
  2. Only purge documents that will never be modified again.
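
For the first point, one way to read it is: lower the per-database revision limit via `PUT /{db}/_revs_limit` and then run compaction via `POST /{db}/_compact`. The sketch below only builds the two requests; the function names and the limit value are illustrative assumptions, and both endpoints need admin rights, which is not handled here.

```python
import json
import urllib.request


def revs_limit_request(base_url, db, limit):
    """Build PUT /{db}/_revs_limit; the body is just the integer as JSON."""
    return urllib.request.Request(
        f"{base_url}/{db}/_revs_limit",
        data=json.dumps(limit).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def compact_request(base_url, db):
    """Build POST /{db}/_compact; CouchDB expects a JSON content type."""
    return urllib.request.Request(
        f"{base_url}/{db}/_compact",
        data=b"",
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Usage (assumed local server, hypothetical database name and limit):
#   urllib.request.urlopen(revs_limit_request("http://localhost:5984", "mydb", 5))
#   urllib.request.urlopen(compact_request("http://localhost:5984", "mydb"))
```

Note that neither call removes the tombstones of deleted documents; only a purge does that.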

Upvotes: 1
