BeeOnRope
BeeOnRope

Reputation: 64925

Couchdb disk size 10x aggregrate document size

I have a couchdb with ~16,000 similar documents of about 500 bytes each. The stats for the db report (commas added):

"disk_size":73,134,193,"data_size":7,369,551

Why is the disk size 10x the data_size? I would expect, if anything, for the disk size to be smaller as I am using the default (snappy) compression and this data should be quite compressible.

I have no views on this DB, and each document has a single revision. Compaction has very little effect.

Here's the full output from hitting the DB URI:

{"db_name":"xxxx","doc_count":17193,"doc_del_count":2,"update_seq":17197,"purge_seq":0,"compact_running":false,"disk_size":78119025,"data_size":7871518,"instance_start_time":"1429132835572299","disk_format_version":6,"committed_update_seq":17197}

Upvotes: 1

Views: 440

Answers (1)

Akshat Jiwan Sharma
Akshat Jiwan Sharma

Reputation: 16000

I think you are getting correct results. couchdb stores documents in chunks of 4kb each (can't find a reference at the moment but you can test it out by storing an empty document). That is min size of a document is 4kb.

Which means that even if you store a data of 500 bytes per document couchdb is going to save it in chunks of 4kb each. So doing a rough calculation

17193*4*1024+(2*4*1024)= 70430720

That seems to be in the range of 78119025 still a little less but that could be due to the way files are stored on the disk.

Upvotes: 2

Related Questions