Reputation:
I found where CouchDB 3.1.0 stores its databases on macOS, but none of the attached documents are there as files: /Users/jacobidiego/Library/Application Support/CouchDB2/var/lib/couchdb
As I am just getting to know CouchDB, I was expecting to find files there, such as JSON documents (it is a JSON document database, after all). Instead I found .couch files storing the database; opened in a text editor, they are a mix of binary data and the contents of the attached JSON files. It seems it is not true JSON at the backend.
I created a test database, loaded some JSON documents and attachments, and checked the database size. Then I attached a 46.9 MB file (the CouchDB installer itself) and watched the database size grow to 47 MB.
This is undesirable for my applications, where attachments can be quotations, images, manuals, etc., linked directly from the HTML output.
Is there any way to tell CouchDB to handle attachments as individual files instead of storing them inside the database.couch file?
I don't need to store multiple revisions of binary files
I don't think an attachment would be compressed inside the database.couch file at all; it would just sit there as is.
I don't think CouchDB would be able to search inside attachments either and I don't need it.
I don't think that CouchDB would ever detect duplicated attachments
I do need serving through HTTP and replication to nodes
I do want to rename the file after uploading it
I do want to avoid duplicated files
I do want to avoid a 3rd layer, like an apache http server with php, as I am aiming to achieve 2-tier applications as simple as possible
I have searched the web but found mostly old, obsolete articles and comments, so I don't know whether this is impossible in the current version.
Upvotes: 0
Views: 2434
Reputation: 3737
As you've discovered, CouchDB uses JSON but does not store individual JSON documents as files in the file system. There are very good reasons for this, and it does not mean that CouchDB is any less a JSON document store. CouchDB uses its own clever B-tree-like file format rather than relying on the file system, which would have been very inefficient for its map-reduce indexing.
As you've also discovered, there are costs and trade-offs associated with storing binary blobs in CouchDB, and if you use it primarily (or significantly) as a block store, you're not making the most of it. Occasional, small attachments are perfectly fine.
A good pattern is to store attachments elsewhere and store only metadata in CouchDB. There is no built-in support for this, but the principle is straightforward: when your client-side code creates a document with an attachment, it first stores the attachment wherever makes sense (a file on a file system, an S3 bucket, or your block store of choice) and, once that operation has completed, stores the reference (file path, S3 key, or similar) in the document JSON.
On fetching this document, the client then reads the attachment reference and fetches the attachment from the block-store.
Extra client-side work, yes, and obviously you'd still need access to a block store.
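To make the pattern concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a built-in CouchDB feature: the block store is assumed to be a plain local directory, and the names `store_blob`, `build_document`, `save_document` and `blob_path` are hypothetical helpers invented for this example. Naming each stored file after its SHA-256 digest is one possible way to avoid duplicates; the only CouchDB call used is the standard `PUT /{db}/{docid}`.

```python
import hashlib
import json
import shutil
import urllib.request
from pathlib import Path


def store_blob(src: Path, store_dir: Path) -> dict:
    """Copy src into store_dir, named by its SHA-256 digest, and return a reference.

    Assumption: the block store is a local directory. The whole file is read
    into memory to hash it, which is fine for a sketch but not for huge files.
    """
    digest = hashlib.sha256(src.read_bytes()).hexdigest()
    store_dir.mkdir(parents=True, exist_ok=True)
    dest = store_dir / digest
    if not dest.exists():  # identical content is stored only once
        shutil.copyfile(src, dest)
    return {"name": src.name, "sha256": digest, "length": dest.stat().st_size}


def build_document(title: str, blob_ref: dict) -> dict:
    """Build the JSON document that CouchDB stores: metadata plus the reference."""
    return {"title": title, "file": blob_ref}


def save_document(db_url: str, doc_id: str, doc: dict) -> dict:
    """PUT the document to CouchDB, e.g. db_url = "http://localhost:5984/mydb".

    Authentication is left out; a real server will normally require it.
    """
    req = urllib.request.Request(
        f"{db_url}/{doc_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def blob_path(doc: dict, store_dir: Path) -> Path:
    """On read: resolve the reference held in a fetched document to the stored file."""
    return store_dir / doc["file"]["sha256"]
```

With this layout the display name lives only in the document, so renaming a file after upload is just a document update, and uploading the same content twice leaves a single file in the store. Serving the files over HTTP and copying them to other nodes remain outside CouchDB and have to be handled by whatever hosts the block store.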
At Cloudant we explored elements of this in various blog posts. The actual techniques used won't be directly applicable to you, but you may find the discussion helpful.
Upvotes: 3